"Brian Selzer" wrote in message
>
> "Authorised User" wrote in message
>
[snip]
>> So, having lots of tables is good from a data-purity point of view... but
>> spare a thought for the poor programmer who has to code for each table.
>> One general table is far from perfect, and we will lose of a great many
>> specific constraints that could help reduce data integrity problems...
>> BUT
>> we think we can run with it. In this particular case.
>
> It is a truly poor programmer indeed who would trade off data integrity
> for a faulty expectation of reduced work. If there are eighty types of
> atomic facts, then whether you're coding for one table or for eighty, you
> still have to code for eighty types of atomic facts. Is it really less
> work to lump it all together?
Unfortunately, given the piss-poor tools that programmers are content to
use--for a whole lot of reasons that I could rage about at length some other
time--the answer to your question isn't as clear as it should be. I will
give you one example from my own recent experience that illustrates the
point.
The client, the taxation authority of a small European country, had a
requirement to provide an application to its helpdesk users who handle calls
from the public about their personal tax returns. The application needed to
present the helpdesk users with a myriad details about each of the tax
allowances and credits each taxpayer might claim. At the time there were
about 120 different allowances for everything from mortgage interest relief
on investment properties to vet bills for guide dogs for the blind, and
everything in between. The descriptions of the benefits were wildly
different (as you might guess), and some required a variable number of
diverse supplimentary facts. (In terms of complexity this subsystem dwarfed
the rest of the tax system by an order of magnitude.)
To make this truly wild, the system also had to be able to respond to annual
changes in tax legislation, within a legally defined six week period. To
fail to implent an announced change within six weeks would require going
back to the legislature and getting them to literallly change the law of the
land. You can see where this is going.
Now normally I only get involved with performance issues at this site, but
having demonstrated how the database design can strongly influence the
performance of the applications, they invited me to review the proposed
solution, which was absolutely classic EAV. To cut a long story short I
told them, in terms, that they were embarking on a "massive blunder", and
had an absolutely bruising meeting with several customer representative and
also the prime contractor. It was 10 against 1, and my poor choice of words
ensured I had no allies in the room at all. But what sealed my fate was the
fact that my recommended solution (i.e. as many tables as it takes) just
wouldn't work in their environment. The DBMS could handle the requirement
in its sleep, and it probably would not have been any more difficult to code
a sufficiently versatile front-end application (though more about that
below). But what killed it dead was the middleware, which was some
extremely over-mature version of Tuxedo that couldn't be replaced for a
whole lot of very good (expensive) reasons. It turned out that this rev of
Tuxedo could support only 128 distinct services, of which something like 100
were already in use. (I may not have the details right, but the point is
Tuxedo imposed severe constraints.) Within their existing architecture my
solution would have required n-hundred additional services, where n would
grow each year.
My proposed solution to that problem was a single dynamic service. The
database would have its hundreds of tables, and there would be a single
service that would use dynamic SQL to query/update them. It would convert
the data to EAV for transit, but it would be done automatically using the
fully-documented, standard meta-data from the DBMS, and the process could
be revered at the client end to recover the original tables. Elegant, I
thought.
But no. This was unfamiliar territory for the programmers, who necessarily
always outnumber the database wonks (precisely because they are so
unproductive and therefore you need lots of them). There was just no
persuading them that dynamic SQL is respectable, and enshrined in the
ANSI/ISO standards. To them it was not a basic skill that they ought to
have mastered in school, nor even a skill to be acquired within a one-day
on-the-job training course. It was impossibly esoteric, risky, obscure, and
just not going to happen. So not only would they not build a generic,
dynamic Tuxedo service driven by the DBMS's own meta-data that they get for
free, they would not build a client application that used the same meta-data
to recover the tables from the EAV stream. To them, it was less risky to
re-invent that meta-data their own way and store the all data in a single
bloated EAV table in the database and build a single generic service that
way.
In case anyone is wondering how it turned out, well, today it stumbles
along. It has so-far appeared to meet the need for rapid change, at least
in the sense that no one can tell if it doesn't work. What is not so clear
is whether anyone has checked how reliable it is, and what is certain is
that even simple queries are insanely complex; there are no integrity
constraints of any kind, and no way to verify that every application is
treating the meta-data consistently. Unfortunately, although this is the
most egregious waste of expensive DBMS technology in the system, it is also
fairly typical. Expections are very low both for run-time performance and
for programmer productivity. This subsystem did not stand out as being
particularly bad by the local standards.
So, if programmers are content to re-invent the wheel, and if they regard
dynamic SQL as being too esoteric (or worse, they simply don't know about it
and don't know the DBMS already provides more and better meta-data than they
can devise--plus the logic to support it), then to them it could well look
like EAV is less work. They are imagining they have to code 200 distinct
services by hand to support 200 distinct tables. To them, the math shows
EAV is cheaper.
(Note that I've written about dynamic SQL here. There are other tools
available and one could imagine even better ones bing invented. I am not
suggesting dynamic SQL is the only solution. I am just pointing out that at
least one pretty good solution exists and one possible explanation why no
one uses it.)
Roy
>> Stay informed about: Attribute-values in separate table