New pgtools Release and Thoughts On DSLs
I've done another, relatively minor release of pgtools
(pgtools.sourceforge.net); besides some code cleanup, the main thing has
been to add a prototype "pg-make-get-function" script. This is what I
would like to discuss.
My first question, and probably the easier one: how important is it
to you, when using toolsets like this, that they don't have other
dependencies you need to install? Right now I'm doing everything in
Bourne shell and other standard Unix tools. However, this is starting
to get more and more hairy for two reasons: subtle (and not-so-subtle)
shell incompatabilities (between traditional sh, ksh and bash), and
just the lack of expressive power of the language itself. So for this
new pg-make-get-function script I wrote it in Ruby (www.ruby-lang.org),
which makes life a lot easier. I see three basic choices when moving
1. Do everything in Bourne shell and common utilities. The advantage
is compatability. The disadvantage is that it will greatly slow
development, bloat code size, and probably make some features
2. Do the basic stuff in Bourne shell and common utilities, and the
more advanced stuff in Ruby or similar. Probably this means moving
the SQL generation stuff to another language, and leaving a lot of
the other stuff in Bourne shell. This is the current case.
This lets those who want to use the simpler tools use them without
installing anything extra. However, it will slow development
somewhat, and can make things slightly sloppy on the interfaces
between the two. For example, one might generate a test, but one
can't just execute it directly and give good diagnostics; because
the testing system wants to execute a .sql file in psql and compare
output against a text file, so these need to be generated. It also
means that the parts written in Bourne shell won't be able to take
advantage of the abstractions and power of a real language.
3. Do everything in Ruby, or some other similarly powerful langauge.
Faster development, faster execution, more concise code, etc.
etc. With a language as suited to creating DSLs (Domain Specific
Languages) as Ruby, you wouldn't even have to do real Ruby
"programming" to take advanage of this, because you'd be using a
sort of mini-language that's specific to, say, testing or changing
database structures. The disadvantage is that you have to go out and
install that tool before you can use the system.
Personally, I'm heading towards option 3 right now, because in the
long run I think I do need that power. I'm in the midst of a database
refactoring right now that should be dead easy (rename four tables and
a bunch of columns), but it's much harder than it ought to be, and I'd
like to try to start automating some of this stuff.
All this has more than a little influence on my little
pg-make-get-function idea. This is an attempt to take a fairly common
idiom of mine and express it in one or two lines of code instead of
thirty or forty, every time.
Essentially, I have lookup tables I use where I don't care if I have to
insert a new entry or get an existing entry; all my code just wants an
existing entry to be returned, if available, or a new one to be created,
if that's necessary. So I write a function to which I hand candidate key
information, and it gets me back the rest of the info, something along
CREATE TABLE credit_card (
credit_card_id serial PRIMARY KEY,
account_number text NOT NULL,
expiry_date text NOT NULL,
UNIQUE (account_number, expiry_date),
created timestamptz NOT NULL DEFAULT
failed timestamptz NULL DEFAULT NULL
CREATE FUNCTION get_credit_card(account_number text, expiry_date)
(For those who wonder why I do this, here are a couple of reasons:
(1. It just makes life easier to have one way of getting an entry, without
having to worry about whether or not it exists.
(2. My web server's database login doesn't need access to card numbers,
so it doesn't have it; it can access the (SECURITY DEFINER) function
to get a card ID from a number/expiry date that it already knows
(because the user entered it), but cannot read a number and expiry date
that it doesn't already know. Thus, even if the web server account is
compromised, you can't use its privleges to get credit card information.)
So I wrote this little pg-make-get-function program, which takes a table
name and one or more column_name,type,test_value parameters, and works
as follows in my cat-schema script:
pgen_setseq credit_card_credit_card_id_seq 1000
pg-make-get-function credit_card \
It then creates the files function/get_credit_card.sql,
test/get_credit_card.sql, and test/get_credit_card.expected.
A bit kludgy? Perhaps more than a bit. Anybody got any ideas for
improvements? Here are some issues I have with it:
1. I hate generated files. The original source code is really the
pg-make-get-function statement; what do I do with the generated files?
Delete them after use? Commit them?
2. Reading the files. Reading the actual SQL output of stuff like this
can be useful for debugging. If I get rid of the files themselves, how
do I do that?
3. Modifications. This isn't very flexable at the moment. If I wanted
to add some interesting constraint to that function, e.g., I want it
to throw an exception if the credit_card exists and failed is NOT NULL
(indicating that the card failed last time I tried to use it and has not
yet been reset), how might I do that? Certainly modifying the generated
files isn't a good idea, since the modifications might get wiped out
next time the load script is run.
I'm open to all thoughts here.
Curt Sampson <cjs@...> +81 90 7737 2974
The power of accurate observation is commonly called cynicism
by those who have not got it. --George Bernard Shaw