This week, I learned about multi-version concurrency control and how it affects database design and operations.
Multi-version concurrency control, or MVCC, is a solution to a common database problem: given a database with many different users, how can the database handle several transactions at once while preserving data integrity and maintaining performance?
A database or system that uses MVCC can support two or more transactions at the same time. Because multiple users might read or modify the same data concurrently, these operations can cause problems if they aren't handled correctly: lost updates (two transactions read the same value, and when both write back, one silently overwrites the other's change), or phantom reads (a transaction runs the same query twice and sees rows appear or disappear, because another transaction inserted or deleted rows in between). The sketch below walks through a lost update.
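Here's a minimal Python sketch of a lost update. The interleaving is written out step by step (rather than using real threads) so the anomaly is deterministic; the bank-balance scenario is just an illustration I chose, not something from a real database:

```python
# Simulate two transactions, T1 and T2, interleaving on one shared "row"
# with no concurrency control at all.
balance = 100

# Both transactions read the same starting value...
t1_value = balance   # T1 reads 100
t2_value = balance   # T2 reads 100

# ...modify their private copies...
t1_value += 10       # T1: add interest
t2_value -= 30       # T2: withdraw

# ...and write back. T2's write overwrites T1's, so the +10 is lost.
balance = t1_value   # balance is now 110
balance = t2_value   # balance is now 70, not the expected 80

print(balance)       # 70 -- T1's update was silently lost
```

The expected result of both transactions is 100 + 10 - 30 = 80, but because T2 read the balance before T1 wrote, T1's change vanishes without any error being raised.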
One of the simplest ways to prevent these problems is locking rows or raising the isolation level. Locking a row prevents other transactions from modifying it until the transaction holding the lock completes. This is safe and relatively easy to implement, but it can cause a performance hit under heavy load, since queued transactions must wait for others to finish and release their locks. MVCC takes a different approach: instead of making readers and writers block each other, the database keeps multiple versions of each row. Writers create a new version, and each reader sees a consistent snapshot of the data as it existed when its transaction began, as in the toy example below.
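This is my own toy illustration of the versioning idea, not how any particular database implements it: every write appends a new version stamped with a monotonically increasing transaction id, and a reader sees the newest version created at or before the point its snapshot was taken. Readers never block writers and writers never block readers.

```python
from dataclasses import dataclass, field
from itertools import count

@dataclass
class VersionedStore:
    # key -> list of (txid, value) versions, oldest first
    _versions: dict = field(default_factory=dict)
    _clock: count = field(default_factory=lambda: count(1))

    def begin(self):
        # Take a "snapshot": remember the current point on the clock.
        return next(self._clock)

    def write(self, key, value):
        # Each write creates a new version; old versions are kept.
        txid = next(self._clock)
        self._versions.setdefault(key, []).append((txid, value))
        return txid

    def read(self, key, snapshot):
        # Return the newest version visible to this snapshot.
        for txid, value in reversed(self._versions.get(key, [])):
            if txid <= snapshot:
                return value
        return None

store = VersionedStore()
store.write("balance", 100)   # initial version

snap = store.begin()          # a reader takes a snapshot here
store.write("balance", 70)    # a later writer creates a new version

print(store.read("balance", snap))           # 100: the old snapshot's view
print(store.read("balance", store.begin()))  # 70: a fresh snapshot sees the update
```

The key point is that the writer did not have to wait for the reader, and the reader still got a consistent view; real systems add machinery on top of this (commit visibility rules, garbage collection of old versions), which I've left out.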
Learning about MVCC has given me an interesting insight into how real-world database systems handle thousands of transactions at once while preserving performance and data integrity as much as possible.