 2008/09/08
|
Last update 1999/08/16
The Labs - Design & Functionality For The NetTips & Tricks using MyDBase
Here some tips & tricks and implementation assistance using
MyDBase.
- Design Hints
- CGI
- Case Study
- Mailing-Lists
- Case Study
- WebCounter
- Case Study
- WebChat
- Search Engine
| MyDB Cookbook1. Design Hints
|
The main advantage of MyDBase are hierachical tables. In SQL-based
systems a table contains multiple entries of the same kind, in MyDB
each table contains just unique fields with their associated data, to point
out again:
MyDBase Concept:
| fullname | Webmaster |
| email | webmaster@here.com |
| age | 25 |
| : | : |
|
SQL Concept:
| fullname | email | age |
| Webmaster | webmaster@here.com | 25 |
| Root | root@here.com | 32 |
| Me | me@here.com | 48 |
| : | : | : |
|
If you like to design a structure for your tasks, consider following:
- Check system limitation of max files and directories in a directory,
some system have 32'768, this limits the amount of tables within one table.
- MyDB only stores ASCII, don't try to store binary in a table
you unlikely won't able to retrieve it. Solution, use pack(-u) (see Perl Manual) to
convert binary to ASCII and then store it.
- How many programs will use the MyDB? Create for each program
a separate user in ./passwd so you can distinct problems easier
in the log-files.
We developed MyDBase actually with the main usage within
several CGIs. Use the perl5 API with tie:
|
use lib qw(path-where-the-TieMyDB.pm-is);
|
|
|
|
use TieMyDB;
|
|
|
|
$s = tie %db,'TieMyDB','localhost',undef,'mycgi','mycgipasswd';
|
|
if(!$s) {
|
|
# --- connection failed
|
|
}
|
|
|
|
untie %db;
|
| $t = tie %handle,host,port [,user,passwd]; |
| $handle{"table:field"} = data; |
| $t->dump("table",%h); |
| @_ = $t->query("table.*");%h = $t->query("table.*");@_ = $t->query("table:*");%h = $t->query("table:*"); |
| data = $handle{"table:field"}; |
| delete $handle{"table"}; |
| delete $handle{"table:field"}; |
| untie %handle; |
|
| MyDB Cookbook3. Case Study: Mailing-Lists
|
There are several things to consider:
- multiple lists
- each list has distribution-list
- archive of all postings
Design:
|
Lists
|
|
list1-l
|
|
dist
|
|
archive
|
|
1998-05
|
|
1998-06
|
|
1998-07
|
|
1998-08
|
|
1998-09
|
|
list2-l
|
|
:
|
|
list3-l
|
|
:
|
Let's look closely to the several tables:
|
$t = tie %db,'TieMyDB','localhost',undef,$user,$passwd;
|
- list1-l:
| fullname | List One |
| posts | 691 |
|
To create a new list, just do
|
$db{"Lists.list1-l:fullname"} = 'List One';
|
This creates Lists.list-l and field fullname = "List One";
- dist:
| sample2@me.com | subscribed |
| sample5@me.com | subscribed |
| sample8@me.com | subscribed |
| sample10@me.com | subscribed |
|
To add new people, just do
|
$db{"Lists.list1-l.dist:$email"} = 'subscribed';
|
To remove
|
delete $db{"Lists.list1-l.dist:$email"};
|
- archive has sub-tables year-month in which
all postings are numbered.
|
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
|
|
localtime(time);
|
|
$date = sprintf("%04d-%02d",$year+1900,$mon+1);
|
|
$postno = $db{"Lists.list1-l:posts"}++;
|
|
$db{"Lists.list1-l.archive.$date.$postno:post"} = $mailtext;
|
When you query the archive,
make sure you sort value-wise, not alphabetically.
|
%postings = $t->query("Lists.list1-l.archive.1998-05.*");
|
|
foreach $p (sort { $a-$b; } keys %postings) {
|
|
print "$p: ",$postings{"Lists.list1-l.archive.1998-05.$p:body"},"\n";
|
|
}
|
Bottleneck: There is a serious bottle-neck to consider,
what happens when the application(s) handling the mailing-list
cannot connect to the database? This is serious to look at.
Make sure no mail is accepted, and the mail tries to
reconnect (kill smptd or adjust /etc/inetd.conf temporarly).
Also, install a port sniffer scanning some ports
(usually used to investigate security holes) and make sure
those ports with important deamons like smtp, httpd and
even mydb do accept connection, if one is missing page
the system-admin. Monitor mydbd 24h non-stop as it
is then truely the heart of your system.
MyDB has been tested for days under heavy load, ie
6000 connections per minutes for two days, but you never
know, so check the worst cases anyway.
| MyDB Cookbook4. Case Study: WebCounter
|
Simple web-counters are many available, but only a few are really well
programmed with client/server (mainly because CGI has to lock
the counter-file in case you don't use cookies). With mydb you certainly
can implement a stable and realiable web-counter easily.
|
tie %db,'TieMyDB','localhost,undef,'default',yourpasswd;
|
|
$count = $db{"WebCounter.$cnt_name.count"}++;
|
|
untie %db;
|
You may use GD.pm a simple package to create and
manipulate GIF within perl to implement the graphical part of
the counter. A sample implementation you can check at WebCounter.
Since a web-counter isn't really system critical, there is
no concern in case mydb is down.
| MyDB Cookbook5. Case Study: WebChat
|
Since most web-application are mainly user-oriented, where multiple
users execute or request information, if the information is static or
read-only there is certainly no problem, but if the users desires to
write or change something then file-locking and alike consideration
make programming a hassle. For that purpose databases are designed.
Now, such an application is also a web-chat, either as CGI or Java-applet
realized:
- Client receives log-file (CGI) or most recent line (Java).
- Client can submit new line (CGI & Java)
Point 2) is critical when two clients connect the same moment. You can
use UNIX file-locking by appending a line to a file, and you don't
have to bother at all about if two or more clients mix their lines into
the chat-log, but that's actually not a nice way to solve it.
So, let's look at this case-study. We might start a separate mydbd
with a new port-number than a mydbd we may have running to handle
all more critical stuff, I suggest this, as chats can become real busy
and in case something goes wrong, not the main more system-critical
mydbd goes down, so let's start a new one:
|
% mydbd -port 4701 -name "Chat Server" &
|
Try to start it within a new directory, so the log and other
files are not shared by another mydbd.
The usual way to open connection:
|
$MAX_LINES = 100;
|
|
|
|
tie %db,'TieMyDB','localhost',4701,'chat',yourpasswd;
|
|
|
|
$new_line = &getline_from_client();
|
|
if(length($new_line)) {
|
|
$lineno = (++$db{"Chat.$room:lineno"}%$MAX_LINES);
|
|
$db{"Chat.$room.log:$lineno"} = $newline;
|
|
}
|
| CGI-based | Java-applet |
| Each CGI call connects to database, and retrieves the log-file, and if required add the new line into the log-file. Certainly this way is efficient. | Each java-applet connects to database, and polls each 5 secs for the Chat.$room:lineno number to see if there is a new line, and if requires requests the new line(s) to display with the applet. Admitly polling is waste of bandwidth and not really real-time. |
|
So in short, if you plan a CGI based webchat then you likely can use mydb very well, as Java thought it's not recommended and
you better consider to write your chat-server from scratch to avoid polling.
| MyDB Cookbook6. Search Engine
|
Large sites (ie. more than 1000 pages) full-text searching is a hassle
to run them from scratch over all pages, therefore db-based full-text searching
is a good solution:
- check-in/out documents from index-file
- rescan only pages which are new or have changed
- good for large mailing-list archives
We suggest a structure like this:
Theoretical Implementation: An indexer takes a file and extracts all words, .txt verbatim, .html tags
are removed but content is indexed.
|
# --- create unique id:
|
|
$db{"search.index.file.filename:id"} = ++$id;
|
|
$db{"search.index.id.id:filename"} = filename;
|
|
$db{"search.index.id.id:title"} = title;
|
|
$db{"search.index.id.id:intro"} = intro;
|
|
|
|
# --- adding new page to index
|
|
foreach $w (@words) {
|
|
wordprefix = substr($w,0,2);
|
|
word = $w;
|
|
$db{"search.words.wordprefix:word"} .= ",".$id;
|
|
unless($db{"search.index.id.id:refs"}=~/,worldprefix/) {
|
|
$db{"search.index.id.id:refs"} .= ",".wordprefix;
|
|
}
|
|
}
|
|
|
|
# --- removing page from index
|
|
$id = $db{"search.index.file.filename:id"};
|
|
foreach $p (split(/,/,$db{"search.id.id:refs"})) {
|
|
foreach $w ($session->query("search.words.$p:*")) {
|
|
# --- contains the reference pointer to me?
|
|
if($db{search.words.$p:$w"}=~/,$id/) {
|
|
# --- removing the reference
|
|
$db{search.words.$p:$w"} =~ s/,$id//;
|
|
}
|
|
}
|
|
}
|
|
|
|
# --- searching in index
|
|
foreach $w (@words) {
|
|
wordprefix = substr($w,0,2);
|
|
word = $w;
|
|
push(@files,$db{"search.words.wordprefix:word"});
|
|
}
|
|
# build logically and/or of id then ie. split(/,/ ...);
|
|
foreach $f (@filesresult) {
|
|
id = $f;
|
|
print "<dt><a href=\"",$db{"search.index.id.id:filename"},"\">";
|
|
print $db{search.index.id.id:title"};
|
|
print "</a>";
|
|
}
|
We will provide a search-engine using mydb and release source in
this page.

Last update 1999/08/16 
All Rights Reserved - (C) 1997 - 2008 by The Labs.Com |