In-Memory Column Store

http://www.oracle.com/technetwork/issue-archive/2014/14-sep/o54dba-2279197.html

The application users are upset with application’s performance when running large analytic queries.

Brief analysis shows that:

  • Increasing the size of the buffer cache has delivered little or no benefit because of the overhead of cache managing
  • Creation of the additional indexes to improve query performance will slow application’s OLTP functionality.
  • Developers can alter the application code to make it more efficient, but it is expensive and time consuming process.

There is a solution that requires neither application code changes nor creation of any new indexes. The solution is the in-memory column store, part of the Oracle Database In-Memory option, available for use with Oracle Database 12c Release 2 (12.1.0.2).

Tables whose entire set of rows, even for a few columns, are selected by queries are great candidates for the new in-memory column store. Tables or partitions that participate in OLTP transactions more than in queries will not benefit as much, so they are not candidates for the in-memory column store.

The documentation claims the in-memory column store is good for the following:

  • Large scans that apply “=”, “<“, “>” and “IN” filters.
  • Queries that return a small number of columns from a table with a large number of columns.
  • Queries that join small tables to large tables.
  • Queries that aggregate data.

It also states it is not designed for the following:

  • Queries with complex predicates.
  • Queries that return a large number of columns.
  • Queries that return large numbers of rows.
  • Queries with multiple large table joins.

Traditionally, Oracle Database has stored data in multicolumn records, also known as rows, one column after another. If a query selects all the columns of a table every time, this arrangement of data inside the database block will perform well.

However, the analytic queries typically select only a handful of columns from all the rows. The database fetches the entire database block into the buffer cache, locates the first row, identifies the start position of the column, extracts the value, locates the starting point of the next row, locates the relative position of the same columns in that row, extracts the value, and so on. The process repeats until the values of these selected columns have been extracted from all the rows.

If the database were able to read the column values from all the rows immediately instead of performing the additional tasks of locating the starting points of the rows and then the relative positions of the columns inside them, the performance would be fantastic.

That calls for a new design called a column store database, in which the columns – not the rows – are stored together. Columns from the rows are stored together in a new structure called a column unit (CU). The CU is not the same as a data block and can be much bigger than the typical 8 K data block. This special arrangement of columns enables the query to grab the values of the selected column from all the rows immediately—without going through the time-consuming task of identifying rows and the columns inside.

In Oracle Database 12c Release 2 (12.1.0.2), we have the option of using both types of stores in the same database. And the column store exists entirely in memory. The combination of the column store structure and memory residency makes the performance of analytical queries soar.

The in-memory column store is a new pool in the system global area (SGA). The DBA decides how much memory to allocate to this new pool and issues this statement to create the inmemory column store:

alter system set inmemory_size=48g scope=spfile;

This is the only configuration step required to enable the column store. This is not a dynamic operation; the database instance needs to be recycled to enable the new column store.

Because the size of this sample column store is only 48 GB, it may be smaller than the overall database size. With Oracle Database In-Memory, however, the DBA can choose specific data to populate the inmemory column store. Even better, the DBA can choose individual partitions of tables rather than entire tables. The DBA can also exclude some columns from the the in-memory table.

The basic syntax for putting tables in the in-memory column store:

alter table TableName inmemory;

The version of the table in the in-memory column store is in the new column store format for better performance. On disk, however, the table continues to be stored in the traditional row store format. When users modify the data in that table, the version of the table in the buffer cache is updated and written back to the disk. When the data is queried, however, the in-memory column store version of the table is used.

In addition, to save space, a DBA can compress the data in the in-memory column store, using the MEMCOMPRESS subclause of the INMEMORY attribute. This compression is inside the inmemory column store only; the segment on the disk is not affected. Higher priority store should be assigned to more-recent partitions to be loaded sooner and stay in memory longer.

Different compression levels can abe defined for different columns in the column store. For example, a table named T1 is in the in-memory column store, but its columns c4 and c5 should not be in the in-memory column store. In addition, columns c3 and c2 should be compressed with the CAPACITY LOW option whereas the other columns should be compressed with QUERY LOW (the default).

alter table t1 inmemory
memcompress for
capacity low (c2,c3)
no inmemory (c4,c5);

Oracle Database automatically keeps the in-memory column store transactionally consistent with the row store format in the buffer cache, similar to how an index is kept transactionally consistent with a table.

What if the total size of all the segments placed in the in-memory column store is more than 48 GB? Oracle Database will determine which segments are loaded in what sequence and how long they stay there, based on a priority that can be defined in the ALTER statement with the PRIORITY subclause of the INMEMORY attribute. The PRIORITY subclause takes four options: CRITICAL, HIGH, MEDIUM, and NONE. The last option does not populate the segment into the column store automatically. That populating occurs only when someone selects from the table.

In-memory column store works with the multiple nodes in an Oracle Real Application Clusters (OracleRAC) database and is distributed among the caches of all nodes. It also can be duplicated on all Oracle RAC nodes by use of another clause: DUPLICATE.

Oglasi
Ovaj unos je objavljen u Nekategorizirano. Bookmarkirajte stalnu vezu.

Komentiraj

Popunite niže tražene podatke ili kliknite na neku od ikona za prijavu:

WordPress.com Logo

Ovaj komentar pišete koristeći vaš WordPress.com račun. Odjava / Izmijeni )

Twitter picture

Ovaj komentar pišete koristeći vaš Twitter račun. Odjava / Izmijeni )

Facebook slika

Ovaj komentar pišete koristeći vaš Facebook račun. Odjava / Izmijeni )

Google+ photo

Ovaj komentar pišete koristeći vaš Google+ račun. Odjava / Izmijeni )

Spajanje na %s