Since the topic comes up periodically, and actual use of splitline
will require recognising discrete shapes, I thought I would have
another go at a definition.
A Shape is a simple polygon and its enclosed area.
A Block consists of one or more non-intersecting Shapes with an
associated population.
Blocks should ideally consist of a single convex shape.
A Region consists of one or more non-intersecting Blocks. Any holes
in a Region should be filled with a zero population Block.
Each contiguous group of Blocks within a Region shall be considered a
Sub-Region.
The Region shall be split into two Semi-Regions.
D is defined as the number of districts.
M is equal to the floor of D/2.
N is equal to D - M.
The target population is equal to the total population divided by D.
A Cut of a Region shall be defined by two points.
A Cut shall be considered a Valid Cut if
- the two points are on the boundary of a Sub-Region
- the line segment connecting the two points does not leave the Sub-Region
- if all Blocks that intercept the line segment are removed, the
remaining Blocks of the Sub-Region form 2 contiguous Areas
- the total population of Blocks on one side of the line is less than
N time the target population and less than M times the target
population on the other side
When determining which side of the line a Block falls, the following
rules shall be used:
- All Blocks in Sub-Regions that fall entirely on one side of the line
shall be considered to fall on that side of the line
- All Blocks in Sub-Regions that intercept the line shall be undefined
- All Blocks in the Sub-Region containing the line segment shall be
considered to fall on whichever side of the line segment the
contiguous area that contains the block falls on.
- All Blocks that intercept the line segment shall be undefined
To split a Region, find the shortest Valid Cut for that Region.
All Blocks which fall on a defined side of the line shall be placed in
that Semi-Region.
A placement of the undefined Blocks between the Semi-Regions shall be
a Valid Placement if
- The Blocks in the Sub-Region containing the line segment shall be
split into 2 contiguous areas
- For each of the other Sub-Regions, all its Blocks are placed in one
or other of the Semi-Regions
The Valid Placement that has the lowest population imbalance[**] shall
be the placement selected. Where there is a tie, find the most
Southern* undefined block that is placed differently between the
placements. The placement that places that block in the Easternmost*
Semi-Region shall be considered to win the tie.
[*] requires more detail for absolute tie breaking
[**] something like imbalance = [population(N_Semi-Region) / N] -
[population(M_Semi-Region) / M]
Recurse use D = N and D = M for the 2 Semi-Regions, until Regions of 1
district are reached.