
## [boost] units library: A hash function approach to type safe quantities

Message 1 of 41 , Oct 1, 2001
The following idea kept me awake last night.

GOAL:

Construct a system that would allow an unlimited number of different units
to be defined and still catch 99.9% of all type errors. It should
automatically generate new units when basic units are multiplied or
divided.

SUMMARY:

A units class is described that will have a single integer that
represents a "hash" of the unit involved. If two units (say foot and
lbs) are multiplied a hash of the product (foot_lbs) is created to act
as a signature of this new unit. If two units are added and their
hashes don't agree--the system complains at compile time. Type safety
is thus 99.999% assured. (There is a chance that two complex units
would have the same hash value and so mistakenly look the same.) The
hash function has the property that any way of generating foot_lbs
from other units will generate the same computed hash.

INTRODUCTION:

There are two problems the units library can solve: unit conversion
and type safety. If the first goal is totally dropped, I think we can
generate an elegant solution to the second problem. This would
suggest that we should have two libraries at the end of the day:
SIunits which does conversions (but ignores most type safety), and a
type safe library to be named later (that does no unit conversions).

It seems that several of our candidate libraries look something like:

//             feet
//             |   lbs
//             |   |   seconds
//             |   |   |
//             v   v   v
//
typedef Unit<  0         > pure;
typedef Unit<  1         > foot;
typedef Unit<  0,  1     > pound;
typedef Unit<  0,  0,  1 > second;
typedef Unit<  1,  1,  0 > foot_lbs;
typedef Unit<  1,  0, -1 > feet_per_second;

What makes this totally ugly is that if you want 10 different units,
you have 10 template integers. Not only is the code unreadable, but
to use 12 different units instead of 10 is almost impossible without
learning sed first!

Basically what is going on is that when two units are multiplied we
want to add a vector that represents the units involved. When two
units are divided, we subtract the vector of their units.
Mathematically this means we need an additive group to represent the
dimension of our units. Currently all the systems that I've heard about use
basis elements--that leads to the vector space. But if instead we
didn't use basis elements, we wouldn't need such a large space to
represent each unit.
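
As a rough sketch of the exponent-vector scheme just described: multiplying
units adds the exponents position by position, and dividing subtracts them.
The metafunction names `product` and `quotient` are made up for illustration,
and `static_assert` and `std::is_same` are C++11 features that postdate this
thread.

```cpp
#include <type_traits>

// One integer exponent per base unit: <feet, lbs, seconds>.
template <int F, int P, int S>
struct Unit {};

// Multiplying two units adds their exponent vectors.
template <class A, class B> struct product;
template <int F1, int P1, int S1, int F2, int P2, int S2>
struct product<Unit<F1, P1, S1>, Unit<F2, P2, S2> > {
    typedef Unit<F1 + F2, P1 + P2, S1 + S2> type;
};

// Dividing two units subtracts their exponent vectors.
template <class A, class B> struct quotient;
template <int F1, int P1, int S1, int F2, int P2, int S2>
struct quotient<Unit<F1, P1, S1>, Unit<F2, P2, S2> > {
    typedef Unit<F1 - F2, P1 - P2, S1 - S2> type;
};

typedef Unit<1, 0, 0> foot;
typedef Unit<0, 1, 0> pound;
typedef Unit<0, 0, 1> second;

static_assert(std::is_same<product<foot, pound>::type, Unit<1, 1, 0> >::value,
              "foot * pound is foot_lbs");
static_assert(std::is_same<quotient<foot, second>::type, Unit<1, 0, -1> >::value,
              "foot / second is feet_per_second");
```

Every additional base unit adds one more template parameter to each of these
declarations, which is the maintenance cost complained about above.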

Each basic unit is given a hash value h(unit). Now if we multiply two
units

h(unit_A * unit_B) = h(unit_A) + h(unit_B)

if we divide two units:

h(unit_A / unit_B) = h(unit_A) - h(unit_B)

A complex transformation looks like:

h(unit_A^2 * unit_B / unit_C) = 2 * h(unit_A) + h(unit_B) - h(unit_C)

Using this scheme we can represent the previous units as:

typedef Unit< 0 > pure;
typedef Unit< h_foot > foot;
typedef Unit< h_pound > pound;
typedef Unit< h_second > second;
typedef Unit< h_foot+h_pound > foot_lbs;
typedef Unit< h_foot - h_second > feet_per_second;

where h_foot, h_pound and h_second are carefully chosen integers. I
laid it out so that it looks like the vector math we did above.
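
A minimal sketch of how such a hash-tagged double might behave. The three
hash values, the `value` member and the `demo` function are assumptions
picked for illustration; nothing here is an agreed design.

```cpp
#include <cassert>

// Illustrative values only; the post leaves the actual choice open.
const int h_foot   = 1000003;
const int h_pound  = 2000029;
const int h_second = 3000017;

// A double tagged at compile time with the hash of its unit.
template <int H>
struct Unit {
    double value;
    explicit Unit(double v) : value(v) {}
};

// Addition is defined only for operands that carry the same hash.
template <int H>
Unit<H> operator+(Unit<H> a, Unit<H> b) { return Unit<H>(a.value + b.value); }

// Multiplication adds the hashes, division subtracts them.
template <int A, int B>
Unit<A + B> operator*(Unit<A> a, Unit<B> b) { return Unit<A + B>(a.value * b.value); }

template <int A, int B>
Unit<A - B> operator/(Unit<A> a, Unit<B> b) { return Unit<A - B>(a.value / b.value); }

typedef Unit<h_foot>            foot;
typedef Unit<h_pound>           pound;
typedef Unit<h_second>          second;
typedef Unit<h_foot + h_pound>  foot_lbs;
typedef Unit<h_foot - h_second> feet_per_second;

inline void demo() {
    feet_per_second v = foot(30.0) / second(2.0);
    // Either multiplication order yields the same hash, so the sum compiles.
    foot_lbs w = foot(3.0) * pound(4.0) + pound(4.0) * foot(3.0);
    // foot bad = foot(1.0) + second(1.0);  // rejected: the hashes differ
    assert(v.value == 15.0);
    assert(w.value == 24.0);
}
```

The values do have to be chosen with some care: with 1, 2 and 3, for
instance, foot * pound would get hash 3 and be indistinguishable from second.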

The cool thing about such a representation is that now we should be
able to replace the last two templates with the following:

typedef Product_of_units<foot,pound> foot_lbs;
typedef Ratio_of_units<foot,second> feet_per_second;

where

Product_of_units<unit_A,unit_B>

is interconvertible with a

Unit< h(unit_A) + h(unit_B) >

and similarly for Ratio_of_units. Thus our final code would look
something like:

typedef Unit< 0 > pure;
typedef Unit< h_foot > foot;
typedef Unit< h_pound > pound;
typedef Unit< h_second > second;
typedef Product_of_units<foot,pound> foot_lbs;
typedef Ratio_of_units<foot,second> feet_per_second;

Such a scheme would allow as many type safe doubles as desired.
Regardless of the number of types introduced almost all incorrect
assignments will be captured.
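
One possible way to get Product_of_units and Ratio_of_units, sketched under
the assumption that each Unit exposes its hash as a static member (here
called `hash`, a made-up name). The lowercase metafunctions are plain C++98;
the alias templates that give the exact spelling above need C++11, which
postdates this thread. The hash values are again only illustrative.

```cpp
#include <type_traits>

const int h_foot = 1000003, h_pound = 2000029, h_second = 3000017;  // illustrative

template <int H>
struct Unit {
    static const int hash = H;   // exposed so other templates can read it
    double value;
};

// C++98-style metafunctions: the result is the nested typedef `type`.
template <class A, class B>
struct product_of_units { typedef Unit<A::hash + B::hash> type; };
template <class A, class B>
struct ratio_of_units { typedef Unit<A::hash - B::hash> type; };

// C++11 alias templates recover the spelling used in the post.
template <class A, class B>
using Product_of_units = typename product_of_units<A, B>::type;
template <class A, class B>
using Ratio_of_units = typename ratio_of_units<A, B>::type;

typedef Unit<0>        pure;
typedef Unit<h_foot>   foot;
typedef Unit<h_pound>  pound;
typedef Unit<h_second> second;
typedef Product_of_units<foot, pound> foot_lbs;
typedef Ratio_of_units<foot, second>  feet_per_second;

// The composite names are the very same types as the hand-written ones.
static_assert(std::is_same<foot_lbs, Unit<h_foot + h_pound> >::value,
              "product adds hashes");
static_assert(std::is_same<feet_per_second, Unit<h_foot - h_second> >::value,
              "ratio subtracts hashes");
static_assert(std::is_same<Product_of_units<feet_per_second, second>, foot>::value,
              "(foot / second) * second is foot");
```

Because the composite is the same type as the corresponding Unit rather than
a distinct class, no conversion between the two is needed in this sketch.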

ISSUES AND PROBLEMS:

o It would be nice to use a proper hash function--say addition
modulo 2^32. This would allow larger hashes and better collision
avoidance. Is there an easy way of doing modulo arithmetic in
templates? (I know they are Turing complete, but kinda slow!)

o Is there a way of automatically generating good hash values for the
basic units? Something close to random would be ideal. If we can't
do the modulo arithmetic they have to be kinda small though. (Say
around 10-100 million on a typical machine, with 2^31 - 1 being the
maximum signed value.)

o I don't know how to do the conversions easily between the
Product_of_units and the basic Units.
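
On the first two points, one possible approach, offered as a sketch rather
than a recommendation: unsigned integer arithmetic in C++ is defined to wrap
modulo 2^N, in template arguments as well, so an unsigned 32-bit template
parameter gives addition modulo 2^32 with no extra machinery. For
near-random base values, a compile-time string hash of the unit's name is
one option. FNV-1a is used below purely as a stand-in (it is not something
proposed in this thread), and `constexpr` and `<cstdint>` are C++11 features.

```cpp
#include <cstdint>
#include <type_traits>

// 32-bit FNV-1a over a string literal, evaluated at compile time.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u) {
    return *s == '\0'
        ? h
        : fnv1a(s + 1, (h ^ static_cast<unsigned char>(*s)) * 16777619u);
}

template <std::uint32_t H>
struct Unit {
    static const std::uint32_t hash = H;
    double value;
};

// Unsigned + and - wrap modulo 2^32, so no explicit reduction is needed.
template <class A, class B>
struct product_of_units { typedef Unit<A::hash + B::hash> type; };
template <class A, class B>
struct ratio_of_units { typedef Unit<A::hash - B::hash> type; };

typedef Unit<fnv1a("foot")>   foot;
typedef Unit<fnv1a("second")> second;
typedef ratio_of_units<foot, second>::type feet_per_second;

static_assert(std::uint32_t(0) - std::uint32_t(1) == 4294967295u,
              "unsigned arithmetic wraps modulo 2^32");
static_assert(std::is_same<product_of_units<feet_per_second, second>::type,
                           foot>::value,
              "the subtraction is undone exactly, even if it wrapped");
```

Name-derived hashes are only pseudo-random, so collisions between complex
units remain possible, just unlikely.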

=============================================================================
Dean Foster dean@...
Statistics, Wharton, U. Penn 215 898 8233
Philadelphia PA 19104-6302 http://diskworld.wharton.upenn.edu
Message 41 of 41 , Oct 8, 2001
Kevin Lynch wrote:
> "George A. Heintzelman" wrote:
> > No. Amount<> is not dimensionless.
>
> I would also agree that it doesn't make sense to take sin(number of
> apples), but that isn't a good argument for dimensionality, I don't
> think: it makes no sense to take sin(mass fraction) in most cases, but
> "mass fraction" is a dimensionless quantity in both the SI definition
> and my different definition. Furthermore, in some systems, like the
> natural units of particle physics, some quantities are dimensionless
> (for example, velocity) that would in SI be dimensionful.

Yes, but it makes sense in some contexts to take the sin of the
dimensionless velocity (or 2 pi times it, anyway). Mass fraction I
don't have a good argument for. It seems like another category of the
clearly dimensionless stuff; it makes sense to multiply a dimensionful
quantity by a mass fraction or other ratio, and get another
same-dimensioned quantity. I don't think this will usually hold for
number of apples.

> But I'm perfectly happy to accept the SI definition going forward from
> here, because I don't think dimensionful/dimensionless is going to be a
> useful distinction for building a unit framework for C++.
>
> > Angles, binomial coefficients, and such are examples of truly
> > dimensionless numbers.
>
> I think we need to be more careful. Binomial coefficients (and other
> constants such as the Bernoulli numbers, pi, e, etc...) are not only
> dimensionless (by any reasonable definition); they are pure numbers; no
> units are attached. Angles may or may not be different.

Hrm. I think I see what you're saying here, but I'll have to think
about it a little more.

> > I still think something needs to address the
> > difference between an angle and other dimensionless units, but that's a
> > different question.
>
> I'll give it a shot. I'm not attached to this description, so if you
> have another approach feel free to convince me :-)
>
> "angular units" are not units in the sense that length and time units
> are "units"; I will go so far as to say that, in the language that I've
> been using, the names of "angular units" are just "tags" on pure
> numbers, not actual units. In the SI, we have the radian as an "angular
> unit", but it is a special case, and isn't like any other unit in the SI:
> "when one expresses the values of derived quantities involving plane
> angle or solid angle, it often aids understanding if the special names
> "steradian" (sr) are used in place of the number 1."
> http://physics.nist.gov/Pubs/SP811/sec04.html#4.3
>
> But, radian can't really be 1! It can't be, because that would imply
> that degree and grad are also 1.

No, it wouldn't. It would imply that 'degree' = pi/180, and 'grad' =
pi/200 (both dimensionless numbers). So I think this is all a
consistent picture, but I'm still not entirely sure whether
this is the right way to encode this in a C++ library. Walter's 'Views'
in SIUnits or something similar might be a better way to deal with it.
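
Just to make the arithmetic concrete, a sketch of the "radian is 1" reading
with the angle names as plain dimensionless constants. The constant names
and the `check_angles` function are illustrative only and are not taken from
SIUnits.

```cpp
#include <cassert>
#include <cmath>

const double pi     = 3.14159265358979323846;
const double radian = 1.0;          // the SI "special name for the number 1"
const double degree = pi / 180.0;   // 360 degrees in a full turn
const double grad   = pi / 200.0;   // 400 grads in a full turn

inline void check_angles() {
    // A right angle is the same pure number whichever tag is used.
    assert(std::fabs(90.0 * degree - 100.0 * grad) < 1e-12);
    assert(std::fabs(90.0 * degree - (pi / 2.0) * radian) < 1e-12);
    // The product is an ordinary double and can go straight into sin().
    assert(std::fabs(std::sin(90.0 * degree) - 1.0) < 1e-12);
}
```

Nothing in the type system stops sin(90.0) from being written by mistake
here, which is the part a tag or view mechanism would have to address.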

> So, in summary, I am willing to accept Amount<> as dimensionful, and I
> think it is better to not call angular measure a unit, since it violates
> the algorithmic rules for units, but not the algorithm for tags. Or so
> I think now.

I think this is right, though I want to see how an actual
implementation plays out. Time to go to work! I'm going to see if I can
play a little with SIUnits and get something that does what we are
talking about integrated with it.

George Heintzelman
georgeh@...