Row-Oriented vs Column-Oriented Databases

2023-10-21

Row-Oriented Databases:

In a row-oriented database, data is stored and retrieved by rows. All values for a particular record or tuple are stored together. Here's a simplified example of how data might be stored in a row-oriented database like MariaDB:

| EmployeeID | FirstName | LastName | Department | Salary |
|-------------|-----------|----------|------------|--------|
| 1           | John      | Doe      | IT         | 60000  |
| 2           | Jane      | Smith    | HR         | 50000  |
| 3           | Bob       | Johnson  | Finance    | 75000  |

On disk, the data for each row is stored together, like:

[1, John, Doe, IT, 60000]
[2, Jane, Smith, HR, 50000]
[3, Bob, Johnson, Finance, 75000]

Row-Oriented Storage Engine (e.g., InnoDB) Details:

  • How it Works:

    • Rows are stored sequentially on disk.
    • Well-suited for transactional workloads with frequent inserts, updates, and deletes.
    • Efficient for retrieving entire records.
  • Pros:

    • Suitable for OLTP (Online Transaction Processing) workloads.
    • Better for transactional consistency and integrity.
    • Good for point queries and updates.
  • Cons:

    • May be less efficient for analytical queries involving aggregations or calculations on large datasets.
    • Indexing and searching can be slower for certain types of queries.

Column-Oriented Databases:

In a column-oriented database, data is stored and retrieved by columns. All values for a particular column are stored together. Using the same "Employees" table, the data might be stored as:

| EmployeeID | 1 | 2 | 3 |
| FirstName  | John | Jane | Bob |
| LastName   | Doe | Smith | Johnson |
| Department | IT | HR | Finance |
| Salary     | 60000 | 50000 | 75000 |

On disk, the storage might look like:

[EmployeeID: 1, 2, 3]
[FirstName: John, Jane, Bob]
[LastName: Doe, Smith, Johnson]
[Department: IT, HR, Finance]
[Salary: 60000, 50000, 75000]

Column-Oriented Storage Engine (e.g., ColumnStore) Details:

  • How it Works:

    • Columns are stored sequentially on disk.
    • Well-suited for analytical workloads with queries involving aggregations or calculations on large datasets.
    • Efficient for retrieving specific columns.
  • Pros:

    • Suitable for OLAP (Online Analytical Processing) workloads.
    • Better for data warehousing and business intelligence applications.
    • Faster for analytical queries involving large datasets.
  • Cons:

    • May be less efficient for transactional workloads with frequent inserts, updates, and deletes.
    • Retrieving entire records can be less efficient.

Disk Storage Considerations:

  • Row-oriented databases are suitable for transactional workloads.
  • Column-oriented databases are well-suited for analytical queries and reporting.

Creating Tables in MariaDB:

Creating a Row-Oriented Table (InnoDB):

CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Department VARCHAR(50),
    Salary INT
) ENGINE=InnoDB;

Creating a Column-Oriented Table (ColumnStore):

CREATE TABLE SalaryInfo (
    EmployeeID INT,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Department VARCHAR(50),
    Salary INT
) ENGINE=ColumnStore;

In these examples, the ENGINE clause specifies the storage engine for each table.

Keep in mind that the availability of storage engines and their features can depend on the specific version of MariaDB you are using, and you may need to install additional plugins or configure your database accordingly.