Ryan Lambert: First Review of Partitioning OpenStreetMap

My previous two posts set the stage to evaluate declarative Postgres partitioning
for OpenStreetMap data.
This post covers what I found when I tested my plan and outlines my next steps.
The goal with this series is to determine if partitioning
is a path worth going down, or if the additional complexity outweighs any benefits.
The first post on partitioning outlined my use case and why I thought
partitioning could be a benefit. The maintenance aspects of
partitioning are my #1 hope for improvement; easy, fast loading and removal
of entire data sets is a big deal for me.
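
To illustrate the kind of maintenance win I am after, here is a rough sketch. The table and partition names are made up for this example; the point is the pattern of detaching and dropping a partition instead of running a bulk DELETE.

    -- Removing one snapshot without partitioning: a bulk DELETE that
    -- scans the table and leaves dead rows behind for VACUUM to clean up.
    DELETE FROM osm.road_line
        WHERE osm_date = '2021-01-01' AND region = 'colorado';

    -- With partitioning, removing the same snapshot is a quick
    -- metadata operation: detach the partition, then drop it.
    ALTER TABLE osmp.road_line
        DETACH PARTITION osmp.road_line_colorado_2021_01_01;
    DROP TABLE osmp.road_line_colorado_2021_01_01;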

The second post
detailed the approach I designed to partition based on date and region.
In that post I
even bragged that a clever workaround was a suitable solution.

“No big deal, creating the osmp.pgosm_flex_partition table gives each osm_date + region a single ID to use to define list partitions.”
    — Arrogant Me
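
Roughly, that workaround looks something like the following sketch. The column names and data types here are illustrative guesses, not the exact definitions from the project.

    CREATE SCHEMA IF NOT EXISTS osmp;

    -- Lookup table assigning a single ID to each osm_date + region combination.
    CREATE TABLE osmp.pgosm_flex_partition (
        id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        osm_date DATE NOT NULL,
        region TEXT NOT NULL,
        UNIQUE (osm_date, region)
    );

    -- Data tables are list partitioned on that surrogate ID...
    CREATE TABLE osmp.road_line (
        pgosm_flex_partition_id BIGINT NOT NULL,
        osm_id BIGINT,
        osm_type TEXT,
        geom GEOMETRY  -- assumes PostGIS is installed
    ) PARTITION BY LIST (pgosm_flex_partition_id);

    -- ...with one partition per loaded snapshot.
    CREATE TABLE osmp.road_line_p_1
        PARTITION OF osmp.road_line
        FOR VALUES IN (1);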

Read on to see where that assumption fell apart and what I plan to do next.

I was hoping to have made a “Go / No-Go” decision by this point… I am currently
at a solid “Probably!”

Load data

For testing I simulated Colorado data being loaded once per
month on the 1st of each month and North America once per year on January 1.
This was conceptually easier to implement and test than trying to capture
exactly what I described in my initial post.
This approach resulted in 17 snapshots of OpenStreetMap being loaded:
15 of Colorado and 2 of North America.
I loaded this data twice, once into the planned partitioned setup and
once into a simple stacked table, to compare performance between the two.
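
The exact dates matter less than the pattern; a schedule along these lines (dates picked only for illustration) produces the 17 snapshots.

    -- Colorado on the 1st of every month, North America every January 1.
    -- 15 monthly snapshots + 2 yearly snapshots = 17 loads.
    SELECT d::DATE AS osm_date, 'colorado' AS region
        FROM generate_series('2020-01-01'::DATE, '2021-03-01', '1 month') d
    UNION ALL
    SELECT d::DATE, 'north-america'
        FROM generate_series('2020-01-01'::DATE, '2021-01-01', '1 year') d
    ORDER BY osm_date, region;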

