SN Pro Tips

View Original

Improving Performance on Older Instances with Table Rotation

If your instance has been around for a while, you've probably built up a few tables that are quite extensive. The good news is - there are steps that you can take to both mitigate, and even preemptively prevent this issue. 

"Table Rotations Groups" are a little-known feature that allows you to split very large tables into manageable chunks. There are two types of table rotation groups: Rotation, and Extension. What we're going to use today, is an extension type. This means that every so often, the table you're rotating will be extended, and all new data will go into the new extended table. 

There are specific use-cases for each table rotation group type. Extension is best for when a table is queried by the sys_created_on field often, but you need to retain all historical data. Rotation on the other hand, sets up a specified number of tables to rotate through using, for a specified period of time each - but once all tables have been used for the specified period of time, it goes back to the first table and overwrites it. This means that data in these tables is not permanent. There is an OOB "rotation"-type extension on the syslog table, for example. 

The benefits of this may not be obvious, but consider the following scenario: 

Say you want to query the table in question for all records created in the last 60 days.
Since the Created [sys_created_on] field is not typically indexed (being a date/time field), you essentially have to scan the entire table . If your instance is a few years old, and the table has been around for a long time with a lot of throughput, you might have a great number of records in this table! As you might imagine, this could easily result in query hell

See this content in the original post

By creating an "extension" type table rotation group, we are telling the database: "Every N days, extend the table in question and put all new records in the latest extension" (These extensions are called "shards"). The result of this, is that if you query by the Created [sys_created_on] field, it only has to scan the shards that contain the records from those dates. 

Imagine the above scenario again, except this time you've got an "extension" table rotation enabled so that every 90 days, the table is extended. If we imagine that the last shard was created 60 days ago, then a query for records created within the last 90 days would only need to scan the two most recent shards - far more efficient than scanning the entirety of all records for all time!

That said, there is a performance hit if you ever do need to scan a large portion of the table, or if you do a query without a sys_created_on query parameter. More on that below.

Should I do this in my instance? 

Maybe. It's good to ask this question and think carefully about it. It should certainly not be done on every table. Only a few are good candidates for table rotation, and many already have rotation enabled! For example, the ecc_queue table already has table rotation enabled! If you have a custom table that, for example, has a high-throughput, is used for tacit reporting, but data retention laws only require that you retain the records in it for two years... you might consider optimizing that table a little bit, by adding a table rotation group (Rotation type). 

Advantages

  • Completely transparent (for the most part) - once you set it, it's handled by the database and you don't need to think about it or do anything special to work with the shards.
  • Allows deletion of old data without affecting current data (using Rotation type).
  • Ensures tables only grow to a reasonable size.
  • Reduces working set of data when date is known for query (using either Rotation or Extension).


Disadvantages

Queries that do not use the table rotation date (Created [sys_created_on]), require an inefficient "join" query to query time ranges that span multiple tables and can be extremely slow if the number of sub-tables is large.
Which it will be.

For example, if you query by Updated [sys_updated] rather than Created [sys_created_on], you will not be able to make use of the "sharding", and will end up with a very inefficient union query that joins all the shards together before performing the query, since the database doesn't know for sure that only a few specific shards might contain the data you're looking for. 

To improve performance, it's therefore obviously best that the query includes a window of created dates. Otherwise, your queries would actually perform worse than if you didn't enable table rotation. For this reason, you should only enable table rotation on tables that you commonly query (either with scripts, or implicitly by loading something in the UI such as the history log) by created date. For example, this would not make sense for the Incident [incident] table, as most queries on that table do not contain a query parameter on the sys_created_on field. 

How do I do it? 

  1. Navigate to the Table Rotation Group [sys_table_rotation] table by entering "sys_table_rotation.list" into the Application Navigator filter bar, and pressing Enter
  2. To create a new Table Rotation, click New at the top of the Table Rotation [sys_table_rotation] list. 
  3. Set the Name field to the name of the table you'd like the rotation to apply to.. 
  4. Set the Duration field to an amount of time representing how long each table should live before another extension (AKA "shard") should be created. I recommend setting this field to somewhere between 90 days and 180 days, depending on how active the table is (more active = lower duration). I'm going to enter 120 days here. 
  5. Make sure that Type is set to Extension.
    1. If you're creating a Rotation type, set the Type field to that, and you'll have the option to enter a number of rotations before it begins overwriting old data. 

Once you've got that set up, simply save the record, and table extension will be enabled! 

Pro-Tip: Table Rotation records are not captured in update sets, so be sure to manually create them in production after testing in sub-prod environments. Also consider using our Include in Update Set tool if you want to move them from dev to higher instances!


See this content in the original post