Tie::DistHash -- a tie implementation facilitating a shared
cache.
use Tie::DistHash;
my(%hash);
my($cache) = tie(%hash,'Tie::DistHash',
ckptfile => 'mycache.dat',
gc_at_ckpt => 0);
$hash{session_key} = {arbitrary=>[$data,\%structure],including=>$objects};
...keys %hash...values %hash...
while ( my($key,$item_as_string) = $cache->eachString ) {
    print "$key likely has objects!\n"
        if $item_as_string =~ m/bless\(/;
}
Tie::DistHash is a tie() implementation with some useful extras
for implementing a cache, such as checkpointing, garbage collection, and
serialization of arbitrary data structures into strings that can be passed
to eval.
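To illustrate what "strings that can be passed to eval" means, here is a minimal sketch using the core Data::Dumper module. It is an assumption that DistHash produces Dumper-style output (the `bless\(` match in the synopsis hints at something similar); the helpers `to_string` and `from_string` are hypothetical and not part of Tie::DistHash.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: serialize any value into a string eval can rebuild.
sub to_string {
    my ($value) = @_;
    local $Data::Dumper::Terse  = 1;   # emit the bare expression, no "$VAR1 ="
    local $Data::Dumper::Indent = 0;   # keep it on one line
    return Dumper($value);
}

# Hypothetical helper: rebuild the value. Only eval strings you trust.
sub from_string {
    my ($string) = @_;
    # Parentheses force "{...}" to parse as a hash constructor, not a block.
    my $value = eval "($string)";
    die "could not rebuild value: $@" if $@;
    return $value;
}

my $item   = { name => 'who', tags => [ 'a', 'b' ] };
my $string = to_string($item);
my $copy   = from_string($string);
print "round trip ok\n" if $copy->{tags}[1] eq 'b';
```

Because the string is executed as Perl code, this approach is only appropriate for data that comes from a trusted source.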
Tying a hash to this class will return an initialized DistHash
object. Use this object to run methods (see METHODS) and set
attributes (see ATTRIBUTES). Using the hash will change the
object.
Most of what is likely to get done with DistHash is as
straightforward as using a normal hash (once that hash has been
tied). Some things that are likely to get done fairly regularly,
though, are difficult to implement purely through a tie
interface. So it becomes a handy-dandy thing that tie returns an
object, which can have methods, etc, etc. Tying to DistHash returns
an object with the following methods:
set_timeout $timeout [,$key]
get_timeout [$key]
get_atime $key
    Returns the time $key was last accessed. If time is greater than
    get_atime($key) and get_timeout($key) combined, the next garbage
    collection will remove $key.
checkpoint [options]
    options is a hash that can override default
    attributes affecting checkpoints. See ATTRIBUTES.
collect_garbage
integrity_check
toString
    Returns a string that can be passed to eval. Be
    aware that it always returns the entire hash...
eachString
    Like each, but returns the next item as a string that can be passed
    to eval. Two out of three dentists prefer this over toString.
init_from FH
undump
    The inverse of toString. Try to use eachString and eval rather than
    toString and undump for large hashes.
sync KEY
    Forces a sync of KEY. This is useful when $db{KEY} is
    a ref to something (e.g., $db{KEY} = {name=>"who",phone=>5551234}),
    and something inside that something has changed (e.g.,
    $db{KEY}{name}="whoelse"). In this case, the value of $db{KEY}
    stays the same, and the tie implementation sees only the access of
    KEY, so only a TOUCH sync happens. After making a change like that,
    $dbobj->sync(KEY) makes sure that everyone in the sync pool
    sees the change.
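The expiry rule described for get_atime and get_timeout in the method list above can be sketched as a plain predicate. This illustrates the comparison only and is not DistHash's internal code; `is_expired` and its arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical predicate: true when the key is due for removal at the
# next garbage collection, i.e. now > last access time + timeout.
sub is_expired {
    my ($now, $atime, $timeout) = @_;
    return $now > $atime + $timeout ? 1 : 0;
}

my $atime   = 1_000;   # last access, in epoch seconds
my $timeout = 300;     # seconds the key may sit idle

print is_expired(1_200, $atime, $timeout) ? "expired\n" : "kept\n";
print is_expired(1_301, $atime, $timeout) ? "expired\n" : "kept\n";
```

Note that the comparison is strict: a key whose idle time equals the timeout exactly is still kept under this reading of the rule.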
Most of the behavior of a hash tied to DistHash is controlled
by options. Options are usually set at initialization, when the object
is tied. Simply pass a hash to tie() after the first two
arguments. See SYNOPSIS.
sync => [ { host    => "hostname1",
            addr    => "1.1.1.1:1234",
            sync_to => 10
          },
          { host    => "hostname1",
            port    => 1234, addr => "1.1.1.1",
            sync_to => 10
          }
        ]
The contents of {addr} overwrite those of {host} (and of {port}, if {addr} includes a port).
One of the elements of this array will usually specify a local address, and there will usually be one or more remote specifications. If no local spec can be bound, the DistHash only sends syncs (see sync_checkint).
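The precedence between {addr}, {host} and {port} can be sketched as follows. The helper `normalize_peer` is hypothetical and only models the rule as stated; it assumes IPv4-style "address:port" strings.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the precedence rule to one sync entry.
# {addr} replaces {host}; a ":port" suffix in {addr} replaces {port}.
# Assumes IPv4-style addresses (an IPv6 literal would need other parsing).
sub normalize_peer {
    my ($spec) = @_;
    my %peer = %$spec;
    if (defined $peer{addr}) {
        my ($host, $port) = split /:/, $peer{addr}, 2;
        $peer{host} = $host;
        $peer{port} = $port if defined $port && length $port;
    }
    return \%peer;
}

my $first  = normalize_peer({ host => 'hostname1', addr => '1.1.1.1:1234' });
my $second = normalize_peer({ host => 'hostname1', port => 1234, addr => '1.1.1.1' });
print "$first->{host}:$first->{port}\n";
print "$second->{host}:$second->{port}\n";
```

Under this reading, both entries from the example resolve to the same host and port.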
sync_events => { ACCESS  => 1, # to refresh access time for timeouts
                 DELETE  => 1,
                 MODIFY  => 1,
                 ADD     => 1,
                 SETTO   => 1, # when setting a timeout
                 CLEAR   => 0, # e.g., before assigning to the hash
                 DESTROY => 0, # end of tie
                 CKPT    => 0, # at checkpoints
                 INTCK   => 0, # at integrity checks
                 GC      => 0  # at garbage collection
               }
NOTE: CKPT, INTCK and GC are not yet implemented.
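One way to picture how such an event table gates synchronization is a simple lookup. The helper `wants_sync` is hypothetical; the table is copied from the sync_events example above.

```perl
use strict;
use warnings;

# Event table copied from the sync_events example.
my %sync_events = (
    ACCESS => 1, DELETE  => 1, MODIFY => 1, ADD   => 1, SETTO => 1,
    CLEAR  => 0, DESTROY => 0, CKPT   => 0, INTCK => 0, GC    => 0,
);

# Hypothetical helper: should this event be sent to the sync pool?
# Unknown event names are treated as "do not sync".
sub wants_sync {
    my ($event) = @_;
    return $sync_events{$event} ? 1 : 0;
}

for my $event (qw(ADD CLEAR)) {
    printf "%-6s %s\n", $event, wants_sync($event) ? 'sync' : 'local only';
}
```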
At startup, the DistHash tries to retrieve state via a full sync
from one of the servers in the pool. It will accept sync commands from
other servers while this full sync completes. If no server in the pool
is reachable, the CODEREF sync_state_retrieval_hook is run. If it
returns 0 or doesn't exist, then state is retrieved from the
checkpoint file. Only after a full state retrieval is complete will
the tie operation return.
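The startup order described above (pool first, then the hook, then the checkpoint file) can be sketched as a fallback chain. This is a model under assumptions, not DistHash's code: each source is represented by a hypothetical coderef that returns a hashref of state, or a false value when it has nothing to offer.

```perl
use strict;
use warnings;

# Hypothetical sketch of the startup order: peers first, then the hook,
# then the checkpoint file.
sub retrieve_state {
    my (%source) = @_;
    for my $peer (@{ $source{peers} || [] }) {
        my $state = $peer->();
        return ($state, 'peer') if $state;
    }
    if (my $hook = $source{hook}) {
        my $state = $hook->();
        return ($state, 'hook') if $state;   # a 0 return falls through
    }
    return ($source{checkpoint}->(), 'checkpoint');
}

my ($state, $from) = retrieve_state(
    peers      => [ sub { return }, sub { return } ],          # nobody reachable
    hook       => sub { return 0 },                            # hook declines
    checkpoint => sub { return { session_key => 'restored' } },
);
print "state came from: $from\n";
```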
When an event occurs that requires synchronization per sync_events, a command is sent to each server in the sync pool.
Note: Be aware that only events actually handled by the tie can
sync automatically. If a value is a reference, and something in what's
referenced changes, there's no way for a tie to know
automatically. See sync for a way to tell Tie::DistHash
manually.
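The limitation in the note above applies to any tied hash, and can be observed with a small counting tie class. `CountingHash` is a hypothetical stand-in built on Tie::StdHash (which ships with Perl in Tie/Hash.pm); it is not Tie::DistHash.

```perl
use strict;
use warnings;

# Minimal tied hash that counts STORE calls.
package CountingHash;
require Tie::Hash;
our @ISA = ('Tie::StdHash');
our $stores = 0;

sub STORE {
    my ($self, $key, $value) = @_;
    $stores++;
    $self->{$key} = $value;
}

package main;

tie my %db, 'CountingHash';
$db{KEY} = { name => 'who', phone => 5551234 };   # goes through STORE
$db{KEY}{name} = 'whoelse';                       # only FETCHes KEY, then
                                                  # changes the inner hash
print "STORE calls seen by the tie: $CountingHash::stores\n";
```

The second assignment changes the data but never reaches STORE, which is why a manual sync of KEY is needed after such a change.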
For fetches and other ATIME-only updates, a touch is sent; no more
than one every sync_atime seconds. For data inserts, updates and
deletes, an appropriate command is sent, basically duplicating the
command received by the server performing the sync.
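The rate limit on touches can be sketched as a small throttle. Whether DistHash tracks this per key or globally is not stated; this sketch assumes per key, and `should_send_touch` and the value of `$sync_atime` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical throttle: allow at most one touch per key every
# $sync_atime seconds. %last_touch remembers when one was last sent.
my $sync_atime = 30;
my %last_touch;

sub should_send_touch {
    my ($key, $now) = @_;
    my $last = $last_touch{$key};
    return 0 if defined $last && $now - $last < $sync_atime;
    $last_touch{$key} = $now;
    return 1;
}

# Three fetches of the same key at t=100, t=110 and t=140.
print join(',', map { should_send_touch('session_key', $_) } 100, 110, 140), "\n";
```

With these timings the first and third fetches would send a touch, while the second falls inside the 30-second window and is suppressed.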
Sync commands are checked for every sync_checkint seconds. This
mechanism is built using SIGALRM, so if you use sync, be careful
when you set or modify $SIG{ALRM}!
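If your own code also needs SIGALRM, one cautious pattern is to wrap the existing handler instead of replacing it. This is a general Perl sketch; the source does not say when DistHash installs its handler or whether it tolerates being wrapped, so the "module" handler below is a hypothetical stand-in.

```perl
use strict;
use warnings;

my @calls;

# Stand-in for a handler some module installed earlier (hypothetical).
$SIG{ALRM} = sub { push @calls, 'module' };

# Wrap rather than replace: remember the old handler and call it too.
my $previous = $SIG{ALRM};
$SIG{ALRM} = sub {
    push @calls, 'mine';
    $previous->(@_) if ref $previous eq 'CODE';
};

# Invoke the handler directly so the demo does not wait on a real timer.
$SIG{ALRM}->('ALRM');
print "handlers run: @calls\n";
```

Keep in mind that alarm() itself is a single shared timer, so chaining handlers does not help if two pieces of code each reset the timer for their own interval.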
Synchronization is currently done blindly over UDP, so how trustworthy it is depends on how trustworthy your network is. This also limits the size of a value: beyond a certain size you will get errors about too-large messages. Messages as large as 11K work fine, and this limitation will hopefully be lifted in the future.
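Given the message size limit, a caller might check a serialized value before storing it. The guard below is a hypothetical sketch: the 11K ceiling is taken from the remark above as a placeholder, and the real limit depends on the operating system and network.

```perl
use strict;
use warnings;

# Assumed ceiling based on the "11K work fine" remark; treat as a placeholder.
my $MAX_SYNC_BYTES = 11 * 1024;

# Hypothetical guard: true when a serialized value should fit in one
# sync message. length() counts characters, so encode to bytes first
# if the data is not plain ASCII.
sub fits_in_sync_message {
    my ($serialized) = @_;
    return length($serialized) <= $MAX_SYNC_BYTES ? 1 : 0;
}

print fits_in_sync_message('x' x 2_000)  ? "small value: ok\n" : "small value: too big\n";
print fits_in_sync_message('x' x 20_000) ? "large value: ok\n" : "large value: too big\n";
```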