2008/09/08

The Labs.Com Web Lab MyDBase MyDB Cookbook
Last update 1999/08/16
The Labs - Design & Functionality For The Net

Tips & Tricks using MyDBase

Here some tips & tricks and implementation assistance using MyDBase.

  1. Design Hints
  2. CGI
  3. Case Study
  4. Mailing-Lists
  5. Case Study
  6. WebCounter
  7. Case Study
  8. WebChat
  9. Search Engine
MyDB Cookbook
1. Design Hints
The main advantage of MyDBase are hierachical tables. In SQL-based systems a table contains multiple entries of the same kind, in MyDB each table contains just unique fields with their associated data, to point out again:

MyDBase Concept:

fullnameWebmaster
emailwebmaster@here.com
age25
::

SQL Concept:

fullnameemailage
Webmasterwebmaster@here.com25
Rootroot@here.com32
Meme@here.com48
:::

If you like to design a structure for your tasks, consider following:

  1. Check system limitation of max files and directories in a directory, some system have 32'768, this limits the amount of tables within one table.
  2. MyDB only stores ASCII, don't try to store binary in a table you unlikely won't able to retrieve it. Solution, use pack(-u) (see Perl Manual) to convert binary to ASCII and then store it.
  3. How many programs will use the MyDB? Create for each program a separate user in ./passwd so you can distinct problems easier in the log-files.

MyDB Cookbook
2. CGI

We developed MyDBase actually with the main usage within several CGIs. Use the perl5 API with tie:

 use lib qw(path-where-the-TieMyDB.pm-is); 
  
 use TieMyDB; 
  
 $s = tie %db,'TieMyDB','localhost',undef,'mycgi','mycgipasswd'; 
 if(!$s) {  
    # --- connection failed 
 } 
  
 untie %db; 

$t = tie %handle,host,port [,user,passwd];
$handle{"table:field"} = data;
$t->dump("table",%h);
@_ = $t->query("table.*");
%h = $t->query("table.*");
@_ = $t->query("table:*");
%h = $t->query("table:*");
data = $handle{"table:field"};
delete $handle{"table"};
delete $handle{"table:field"};
untie %handle;

MyDB Cookbook
3. Case Study: Mailing-Lists

There are several things to consider:
  1. multiple lists
  2. each list has distribution-list
  3. archive of all postings

Design:

    Lists 
       list1-l 
          dist 
          archive 
             1998-05 
             1998-06 
             1998-07 
             1998-08 
             1998-09 
       list2-l 
          : 
       list3-l 
          : 

Let's look closely to the several tables:

 $t = tie %db,'TieMyDB','localhost',undef,$user,$passwd; 

  • list1-l:

    fullnameList One
    posts691

    To create a new list, just do

     $db{"Lists.list1-l:fullname"} = 'List One'; 

    This creates Lists.list-l and field fullname = "List One";

  • dist:

    sample2@me.comsubscribed
    sample5@me.comsubscribed
    sample8@me.comsubscribed
    sample10@me.comsubscribed

    To add new people, just do

     $db{"Lists.list1-l.dist:$email"} = 'subscribed'; 

    To remove

     delete $db{"Lists.list1-l.dist:$email"}; 

  • archive has sub-tables year-month in which all postings are numbered.

     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = 
         localtime(time); 
     $date = sprintf("%04d-%02d",$year+1900,$mon+1); 
     $postno = $db{"Lists.list1-l:posts"}++; 
     $db{"Lists.list1-l.archive.$date.$postno:post"} = $mailtext; 

    When you query the archive, make sure you sort value-wise, not alphabetically.

     %postings = $t->query("Lists.list1-l.archive.1998-05.*"); 
     foreach $p (sort { $a-$b; } keys %postings) { 
        print "$p: ",$postings{"Lists.list1-l.archive.1998-05.$p:body"},"\n"; 
     } 

Bottleneck: There is a serious bottle-neck to consider, what happens when the application(s) handling the mailing-list cannot connect to the database? This is serious to look at. Make sure no mail is accepted, and the mail tries to reconnect (kill smptd or adjust /etc/inetd.conf temporarly).

Also, install a port sniffer scanning some ports (usually used to investigate security holes) and make sure those ports with important deamons like smtp, httpd and even mydb do accept connection, if one is missing page the system-admin. Monitor mydbd 24h non-stop as it is then truely the heart of your system.

MyDB has been tested for days under heavy load, ie 6000 connections per minutes for two days, but you never know, so check the worst cases anyway.

MyDB Cookbook
4. Case Study: WebCounter

Simple web-counters are many available, but only a few are really well programmed with client/server (mainly because CGI has to lock the counter-file in case you don't use cookies). With mydb you certainly can implement a stable and realiable web-counter easily.

 tie %db,'TieMyDB','localhost,undef,'default',yourpasswd
 $count = $db{"WebCounter.$cnt_name.count"}++; 
 untie %db; 

You may use GD.pm a simple package to create and manipulate GIF within perl to implement the graphical part of the counter. A sample implementation you can check at WebCounter.

Since a web-counter isn't really system critical, there is no concern in case mydb is down.

MyDB Cookbook
5. Case Study: WebChat

Since most web-application are mainly user-oriented, where multiple users execute or request information, if the information is static or read-only there is certainly no problem, but if the users desires to write or change something then file-locking and alike consideration make programming a hassle. For that purpose databases are designed. Now, such an application is also a web-chat, either as CGI or Java-applet realized:
  1. Client receives log-file (CGI) or most recent line (Java).
  2. Client can submit new line (CGI & Java)

Point 2) is critical when two clients connect the same moment. You can use UNIX file-locking by appending a line to a file, and you don't have to bother at all about if two or more clients mix their lines into the chat-log, but that's actually not a nice way to solve it.

So, let's look at this case-study. We might start a separate mydbd with a new port-number than a mydbd we may have running to handle all more critical stuff, I suggest this, as chats can become real busy and in case something goes wrong, not the main more system-critical mydbd goes down, so let's start a new one:

 % mydbd -port 4701 -name "Chat Server" & 

Try to start it within a new directory, so the log and other files are not shared by another mydbd.

The usual way to open connection:

 $MAX_LINES = 100; 
  
 tie %db,'TieMyDB','localhost',4701,'chat',yourpasswd
  
 $new_line = &getline_from_client(); 
 if(length($new_line)) { 
    $lineno = (++$db{"Chat.$room:lineno"}%$MAX_LINES); 
    $db{"Chat.$room.log:$lineno"} = $newline; 
 } 

CGI-basedJava-applet
Each CGI call connects to database, and retrieves the log-file, and if required add the new line into the log-file.

Certainly this way is efficient.

Each java-applet connects to database, and polls each 5 secs for the Chat.$room:lineno number to see if there is a new line, and if requires requests the new line(s) to display with the applet. Admitly polling is waste of bandwidth and not really real-time.

So in short, if you plan a CGI based webchat then you likely can use mydb very well, as Java thought it's not recommended and you better consider to write your chat-server from scratch to avoid polling.

MyDB Cookbook
6. Search Engine

Large sites (ie. more than 1000 pages) full-text searching is a hassle to run them from scratch over all pages, therefore db-based full-text searching is a good solution:
  • check-in/out documents from index-file
  • rescan only pages which are new or have changed
  • good for large mailing-list archives

We suggest a structure like this:

Theoretical Implementation: An indexer takes a file and extracts all words, .txt verbatim, .html tags are removed but content is indexed.

 # --- create unique id: 
 $db{"search.index.file.filename:id"} = ++$id;  
 $db{"search.index.id.id:filename"} = filename
 $db{"search.index.id.id:title"} = title
 $db{"search.index.id.id:intro"} = intro
  
 # --- adding new page to index 
 foreach $w (@words) { 
    wordprefix = substr($w,0,2); 
    word = $w; 
    $db{"search.words.wordprefix:word"} .= ",".$id; 
    unless($db{"search.index.id.id:refs"}=~/,worldprefix/) { 
       $db{"search.index.id.id:refs"} .= ",".wordprefix
    } 
 } 
  
 # --- removing page from index 
 $id = $db{"search.index.file.filename:id"}; 
 foreach $p (split(/,/,$db{"search.id.id:refs"})) { 
    foreach $w ($session->query("search.words.$p:*")) { 
       # --- contains the reference pointer to me? 
       if($db{search.words.$p:$w"}=~/,$id/) { 
          # --- removing the reference 
          $db{search.words.$p:$w"} =~ s/,$id//; 
       } 
    } 
 } 
  
 # --- searching in index 
 foreach $w (@words) { 
    wordprefix = substr($w,0,2); 
    word = $w; 
    push(@files,$db{"search.words.wordprefix:word"}); 
 } 
 # build logically and/or of id then ie. split(/,/ ...); 
 foreach $f (@filesresult) { 
    id = $f; 
    print "<dt><a href=\"",$db{"search.index.id.id:filename"},"\">"; 
    print $db{search.index.id.id:title"}; 
    print "</a>"; 
 } 

We will provide a search-engine using mydb and release source in this page.

                                                                                                                                   

MyDBase

Last update 1999/08/16

All Rights Reserved - (C) 1997 - 2008 by The Labs.Com

Top of Page

The Labs.Com