Considering that HBase stores each column family in a separate HFile and a row can span many column families, how does HBase ensure that a put/delete operation on a row that spans multiple column families is actually atomic?

All writes to a row, no matter how many column families are in that row, go to one regionserver. That regionserver writes the edit to the region's WAL (HLog), the writes are sync'd, and then the data is added to the memstore so it can be served. Once the memstore hits its limit, it is flushed to disk. If the regionserver crashes, dies, or has the plug pulled, the WAL can be replayed to keep everything consistent. For more gory details, see HBASE-2283 and HBase Architecture 101.
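To make the ordering in this answer concrete, here is a minimal toy model in plain Java. It is only a sketch, not HBase code: `ToyRegion`, `Cell`, `WalEntry`, `put`, and `recover` are hypothetical names, the in-memory list stands in for a durable, synced log, and it assumes the whole row edit (all column families) is logged as a single record before any memstore is touched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: these classes are hypothetical and are not HBase APIs.
class ToyRegion {
    // A single cell of a row mutation, tagged with its column family.
    static final class Cell {
        final String family, qualifier, value;
        Cell(String family, String qualifier, String value) {
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // One log record carries every cell of the row edit, across all families.
    static final class WalEntry {
        final String row;
        final List<Cell> cells;
        WalEntry(String row, List<Cell> cells) {
            this.row = row;
            this.cells = new ArrayList<>(cells);
        }
    }

    // Stands in for the durable, synced log (assumption: in memory here).
    final List<WalEntry> wal = new ArrayList<>();
    // One memstore per column family: family -> ("row/qualifier" -> value).
    final Map<String, Map<String, String>> memstores = new TreeMap<>();

    // Write path: log the whole row edit first, then apply it to the memstores.
    void put(String row, List<Cell> cells) {
        WalEntry entry = new WalEntry(row, cells);
        wal.add(entry);   // step 1: one record for the whole row edit
        apply(entry);     // step 2: make it visible in each family's memstore
    }

    private void apply(WalEntry entry) {
        for (Cell c : entry.cells) {
            memstores.computeIfAbsent(c.family, f -> new TreeMap<>())
                     .put(entry.row + "/" + c.qualifier, c.value);
        }
    }

    // Crash recovery: rebuild the memstores by replaying the surviving log.
    static ToyRegion recover(List<WalEntry> survivingWal) {
        ToyRegion region = new ToyRegion();
        for (WalEntry e : survivingWal) {
            region.wal.add(e);
            region.apply(e);
        }
        return region;
    }
}
```

Because the log record is written as one unit, a replay after a crash either restores the cells for every column family of the edit or, if the record never made it to the log, none of them.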

HBase currently achieves row-level atomicity, despite writing multiple HFiles, by flushing all column families at the same time. The flush is triggered when the largest column family reaches the configured flush size. There is an additional MemStore-level timestamp that enables multi-version concurrency control for MemStore reads, but it does not exist for key/values written to HFiles. Switching to per-column-family flushes (a desirable feature for improving efficiency) would require a similar timestamp to be added to the file format as well.
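The flush rule described in this answer can be sketched as a small toy model in plain Java. This is an illustration under stated assumptions, not HBase's implementation: `ToyFlushPolicy`, `recordWrite`, and `flushAll` are hypothetical names, sizes are plain byte counters, and "flushing" just resets the counters.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: hypothetical class, not an HBase API.
class ToyFlushPolicy {
    final long flushSizeBytes;                                  // configured flush size
    final Map<String, Long> memstoreBytes = new LinkedHashMap<>(); // family -> bytes held
    int flushCount = 0;

    ToyFlushPolicy(long flushSizeBytes) {
        this.flushSizeBytes = flushSizeBytes;
    }

    // Record a write to one family; returns true if it triggered a flush.
    boolean recordWrite(String family, long bytes) {
        memstoreBytes.merge(family, bytes, Long::sum);
        long largest = Collections.max(memstoreBytes.values());
        if (largest >= flushSizeBytes) {
            flushAll();
            return true;
        }
        return false;
    }

    // Every family is flushed together, so the files on disk never hold
    // only part of a row edit that spanned several families.
    void flushAll() {
        for (Map.Entry<String, Long> e : memstoreBytes.entrySet()) {
            e.setValue(0L);
        }
        flushCount++;
    }
}
```

In this model a small family is flushed along with a large one even when it holds very little data, which is the inefficiency that per-column-family flushing would address.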