Personal Home Page of Edmund Horner

Subversion hacking

A quick-and-dirty MySQL backend

The aim is to alter libsvn_fs (and much smaller parts of svnadmin and libsvn_repos) to support storing filesystem data in a MySQL database. Configuration data will be stored in the repository directory, alongside the existing non-BDB repository files.

The practical reasons for such a scheme may include increased efficiency, integration with existing SQL databases, another interface to the data (using simple SQL statements), and buzzword points. Since, as far as I know, this hasn't been done before, I'm hoping that the one practical outcome of this experiment will be to provide exciting new data on the above.

Status

"svnadmin create" accepts the arguments --mysql-host=HOST[:PORT], --mysql-username=USERNAME, --mysql-password=PASSWORD, and --mysql-database=DATABASE. If --mysql-host is given, then HOST[:PORT], USERNAME, PASSWORD and DATABASE are stored in the file db/mysql, and the database DATABASE on the MySQL server is populated with an empty repository. DATABASE must already exist and contain no tables. ("svnadmin create" will continue to create official BDB repositories if --mysql-host is not given.)
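
To illustrate the HOST[:PORT] argument format, here is a minimal sketch of how such a value could be split into its two parts. The function name parse_mysql_host and the fallback port 3306 (the usual MySQL default) are assumptions made for this example, not code taken from the patch, and the sketch does not try to handle IPv6 literals.

```c
#include <stdlib.h>
#include <string.h>

/* Port used when HOST[:PORT] carries no port (assumed default). */
#define DEFAULT_MYSQL_PORT 3306

/* Split "HOST[:PORT]" into a host string and a port number.
   HOST_BUF receives the NUL-terminated host and must hold HOST_BUF_LEN
   bytes.  Returns 0 on success, -1 if ARG is malformed or too long.
   Hypothetical helper, for illustration only.  */
int
parse_mysql_host (const char *arg, char *host_buf, size_t host_buf_len,
                  int *port)
{
  const char *colon = strchr (arg, ':');
  size_t host_len = colon ? (size_t) (colon - arg) : strlen (arg);
  char *end;
  long value;

  /* The host part must be non-empty and fit in the buffer. */
  if (host_len == 0 || host_len >= host_buf_len)
    return -1;

  memcpy (host_buf, arg, host_len);
  host_buf[host_len] = '\0';

  if (colon == NULL)
    {
      *port = DEFAULT_MYSQL_PORT;
      return 0;
    }

  /* Reject an empty port, trailing junk, and out-of-range values. */
  value = strtol (colon + 1, &end, 10);
  if (end == colon + 1 || *end != '\0' || value < 1 || value > 65535)
    return -1;

  *port = (int) value;
  return 0;
}
```

Called with "db.example.org:3307" this yields the host "db.example.org" and port 3307; called with plain "localhost" it falls back to the assumed default port.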

All access to such a repository will use the MySQL database for storing and retrieving data. This appears to work quite well, although I have not run the tests or done any extra concurrency testing (I have concerns about how transactions will work with MySQL).
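
One way the trail operations could map onto SQL transactions is sketched below. This is an assumption about a possible design, not the code in the patch: sketch_trail_t, sql_exec_fn and log_sql are hypothetical stand-ins, and the executor callback takes the place of a real call such as mysql_query(). Note that MyISAM tables are non-transactional, so a ROLLBACK would not undo writes to them; only a transactional engine such as InnoDB gives an aborted trail its intended meaning.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: send one SQL statement, return 0 on success. */
typedef int (*sql_exec_fn) (void *baton, const char *sql);

/* Simplified stand-in for a trail: an executor plus a "live" flag. */
typedef struct
{
  sql_exec_fn exec;
  void *baton;
  int in_txn;
} sketch_trail_t;

/* Open a transaction; nested trails are refused in this sketch. */
int
sketch_begin_trail (sketch_trail_t *trail)
{
  if (trail->in_txn || trail->exec (trail->baton, "START TRANSACTION") != 0)
    return -1;
  trail->in_txn = 1;
  return 0;
}

/* Shared tail of commit and abort: end the live transaction with SQL. */
static int
finish_trail (sketch_trail_t *trail, const char *sql)
{
  if (!trail->in_txn)
    return -1;
  trail->in_txn = 0;
  return trail->exec (trail->baton, sql) == 0 ? 0 : -1;
}

int
sketch_commit_trail (sketch_trail_t *trail)
{
  return finish_trail (trail, "COMMIT");
}

int
sketch_abort_trail (sketch_trail_t *trail)
{
  return finish_trail (trail, "ROLLBACK");
}

/* Demo executor: appends each statement plus ';' to a log buffer
   instead of talking to a server.  */
typedef struct
{
  char text[128];
} sql_log_t;

int
log_sql (void *baton, const char *sql)
{
  sql_log_t *log = baton;

  if (strlen (log->text) + strlen (sql) + 2 > sizeof log->text)
    return -1;
  strcat (log->text, sql);
  strcat (log->text, ";");
  return 0;
}
```

Running a begin/commit pair followed by a begin/abort pair through log_sql records the statements in the order they would be sent to the server, which makes the mapping easy to inspect without a database.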

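The MySQL repository runs much slower than a BDB one created with --bdb-txn-nosync, and with InnoDB tables it is also more than twice as slow as default BDB; only MyISAM tables roughly match default BDB. Here are some quick benchmarks (in minutes:seconds) for loading a 600-revision, 164-megabyte dump file: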

                  MySQL InnoDB   MySQL MyISAM   BDB (default)   BDB (with --bdb-txn-nosync)
Load repository   84:44          35:02          37:16           9:19

I plan to experiment with alternative methods of storing the data, with the aim of improving the speed of access. For example, I believe replacing the "next-id" records and base-36 keys with MySQL AUTO_INCREMENT fields will improve things significantly.
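
For context, this is roughly the work a base-36 key scheme implies each time a new row needs an ID: read the stored "next-id" value, hand it out, and write back its successor. The sketch below shows only the successor step. The function name next_base36_key is made up for this example, the digit order 0-9 then a-z is an assumption, and the code relies on an ASCII-style character set in which a-z are contiguous. With an AUTO_INCREMENT column the server would assign the number itself, so neither this arithmetic nor the extra read and write of the "next-id" record would be needed.

```c
#include <string.h>

/* Write the successor of the base-36 key KEY (digits 0-9, then a-z)
   into NEXT, which must have room for strlen (KEY) + 2 bytes.
   Returns 0 on success, -1 if KEY is empty or has an invalid digit.
   Hypothetical helper, for illustration only.  */
int
next_base36_key (const char *key, char *next)
{
  size_t len = strlen (key);
  size_t i;
  int carry = 1;   /* adding one is a carry into the last digit */

  if (len == 0)
    return -1;

  /* Fill next[1..len], keeping next[0] free for a final carry. */
  next[len + 1] = '\0';
  for (i = len; i > 0; i--)
    {
      char c = key[i - 1];

      if (!((c >= '0' && c <= '9') || (c >= 'a' && c <= 'z')))
        return -1;

      if (carry)
        {
          if (c == 'z')
            c = '0';              /* wrap around, carry continues */
          else
            {
              c = (c == '9') ? 'a' : (char) (c + 1);
              carry = 0;
            }
        }
      next[i] = c;
    }

  if (carry)
    next[0] = '1';                      /* key grew by one digit */
  else
    memmove (next, next + 1, len + 1);  /* drop the unused slot */

  return 0;
}
```

For example, the successor of "9" is "a", the successor of "az" is "b0", and the successor of "zz" is "100".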

Subversion database schema

Summary of changes

Here's what I did in approximate order:

  1. Copied libsvn_fs/bdb/ to libsvn_fs/mysql/, and prefixed the file names with "mysql_" (also standardised on underscores instead of hyphens in the file names).
  2. Went through the new files, and replaced BDB code with various MySQL-compatible SQL statements. The only change to the schema was to explode skels into table columns. List skels required three additional tables: representation_windows (containing WINDOW records from representations), transaction_copies (COPIES from transactions), and transaction_props (PROPLIST from transactions).
  3. Moved the remaining BDB-specific stuff out of fs.c into bdb/bdb_fs.c. Also created mysql/mysql_fs.c along the same lines. The functions in bdb/ and mysql/ have been renamed to use simpler names, prefixed with "bdb_" or "mysql_" as appropriate.
  4. Created the svn_fs_bdb_t and svn_fs_mysql_t types, and modified svn_fs_t:
    /*** The filesystem structure.  ***/
    
    typedef struct
    {
      /* A Berkeley DB environment for all the filesystem's databases.
         This establishes the scope of the filesystem's transactions.  */
      DB_ENV *env;
    
      /* The filesystem's various tables.  See `structure' for details.  */
      DB *changes;
      DB *copies;
      DB *nodes;
      DB *representations;
      DB *revisions;
      DB *strings;
      DB *transactions;
      DB *uuids;
    
    } svn_fs_bdb_t;
    
    typedef struct
    {
      /* MySQL connection. */
      MYSQL *conn;
    
      /* Connection settings for the server. */
      char *host;
      int port;
      char *username;
      char *password;
      char *database;
    
    } svn_fs_mysql_t;
    
    /* Forward declaration of the svn_fs_db_ops_t type, defined below. */
    struct svn_fs_db_ops_t;
    
    struct svn_fs_t
    {
      /* A pool managing this filesystem.  Freeing this pool must
         completely clean up the filesystem, including any database
         or system resources it holds.  */
      apr_pool_t *pool;
    
      /* The path to the repository's top-level directory. */
      char *path;
    
      /* DB-specific data. */
      union
      {
        svn_fs_bdb_t bdb;
        svn_fs_mysql_t mysql;
      };
    
      /* Virtual table of DB operations. */
      struct svn_fs_db_ops_t *ops;
    
      /* A boolean for tracking when we have a live
         transaction trail alive. */
      svn_boolean_t in_txn_trail;
    
      /* A callback function for printing warning messages, and a baton to
         pass through to it.  */
      svn_fs_warning_callback_t warning;
      void *warning_baton;
    
      /* The filesystem configuration. */
      apr_hash_t *config;
    
      /* The filesystem UUID (or NULL if not-yet-known; see svn_fs_get_uuid). */
      const char *uuid;
    };
  5. Created the svn_fs_db_ops_t type:
    typedef struct svn_fs_db_ops_t
    {
      /* General DB operations */
    
      int (* is_open) (struct svn_fs_t *fs);
    
      svn_error_t *(* create) (struct svn_fs_t *fs, const char *path);
    
      svn_error_t *(* open) (struct svn_fs_t *fs, const char *path);
    
      svn_error_t *(* close) (struct svn_fs_t *fs);
    
      /* Trail operations */
    
      svn_error_t *(* begin_trail) (trail_t *trail);
      svn_error_t *(* abort_trail) (trail_t *trail);
      svn_error_t *(* commit_trail) (trail_t *trail);
    
      /* 'changes' table operations */
    
      svn_error_t *(* changes_add) (struct svn_fs_t *fs, const char *key,
                                    svn_fs__change_t *change,
                                    struct trail_t *trail);
    
      svn_error_t *(* changes_delete) (struct svn_fs_t *fs, const char *key,
                                       struct trail_t *trail);
    
      svn_error_t *(* changes_fetch) (apr_hash_t **changes_p, struct svn_fs_t *fs,
                                      const char *key, struct trail_t *trail);
    
      svn_error_t *(* changes_fetch_raw) (apr_array_header_t **changes_p,
                                          struct svn_fs_t *fs, const char *key,
                                          struct trail_t *trail);
    
      /* 'copies' table operations */
    
      svn_error_t *(* reserve_copy_id) (const char **copy_id_p, svn_fs_t *fs,
                                        trail_t *trail);
    
      svn_error_t *(* create_copy) (svn_fs_t *fs, const char *copy_id,
                                    const char *src_path, const char *src_txn_id,
                                    const svn_fs_id_t *dst_noderev_id,
                                    svn_fs__copy_kind_t kind, trail_t *trail);
    
      svn_error_t *(* delete_copy) (svn_fs_t *fs, const char *copy_id,
                                    trail_t *trail);
    
      svn_error_t *(* get_copy) (svn_fs__copy_t **copy_p, svn_fs_t *fs,
                                 const char *copy_id, trail_t *trail);
    
      /* 'nodes' table operations */
    
      svn_error_t *(* new_node_id) (svn_fs_id_t **id_p, svn_fs_t *fs,
                                    const char *copy_id, const char *txn_id,
                                    trail_t *trail);
    
      svn_error_t *(* delete_nodes_entry) (svn_fs_t *fs, const svn_fs_id_t *id,
                                           trail_t *trail);
    
      svn_error_t *(* new_successor_id) (svn_fs_id_t **successor_p, svn_fs_t *fs,
                                         const svn_fs_id_t *id,
                                         const char *copy_id, const char *txn_id,
                                         trail_t *trail);
    
      svn_error_t *(* get_node_revision) (svn_fs__node_revision_t **noderev_p,
                                          svn_fs_t *fs, const svn_fs_id_t *id,
                                          trail_t *trail);
    
      svn_error_t *(* put_node_revision) (svn_fs_t *fs, const svn_fs_id_t *id,
                                          svn_fs__node_revision_t *noderev,
                                          trail_t *trail);
    
      /* 'representations' table operations */
    
      svn_error_t *(* read_rep) (svn_fs__representation_t **rep_p,
                                 svn_fs_t *fs, const char *key, trail_t *trail);
    
      svn_error_t *(* write_rep) (svn_fs_t *fs, const char *key,
                                  const svn_fs__representation_t *rep,
                                  trail_t *trail);
    
      svn_error_t *(* write_new_rep) (const char **key, svn_fs_t *fs,
                                      const svn_fs__representation_t *rep,
                                      trail_t *trail);
    
      svn_error_t *(* delete_rep) (svn_fs_t *fs, const char *key, trail_t *trail);
    
      /* 'revisions' table operations */
    
      svn_error_t *(* get_rev) (svn_fs__revision_t **revision_p, svn_fs_t *fs,
                                svn_revnum_t rev, trail_t *trail);
    
      svn_error_t *(* put_rev) (svn_revnum_t *rev, svn_fs_t *fs,
                                const svn_fs__revision_t *revision,
                                trail_t *trail);
    
      svn_error_t *(* youngest_rev) (svn_revnum_t *youngest_p, svn_fs_t *fs,
                                     trail_t *trail);
    
      /* 'strings' table operations */
    
      svn_error_t *(* string_read) (svn_fs_t *fs, const char *key, char *buf,
                                    svn_filesize_t offset, apr_size_t *len,
                                    trail_t *trail);
    
      svn_error_t *(* string_size) (svn_filesize_t *size, svn_fs_t *fs,
                                    const char *key, trail_t *trail);
    
      svn_error_t *(* string_append) (svn_fs_t *fs, const char **key,
                                      apr_size_t len, const char *buf,
                                      trail_t *trail);
    
      svn_error_t *(* string_clear) (svn_fs_t *fs, const char *key,
                                     trail_t *trail);
    
      svn_error_t *(* string_delete) (svn_fs_t *fs, const char *key,
                                      trail_t *trail);
    
      svn_error_t *(* string_copy) (svn_fs_t *fs, const char **new_key,
                                    const char *key, trail_t *trail);
    
      /* 'transactions' table operations */
    
      svn_error_t *(* create_txn) (const char **txn_name_p, svn_fs_t *fs,
                                   const svn_fs_id_t *root_id, trail_t *trail);
    
      svn_error_t *(* delete_txn) (svn_fs_t *fs, const char *txn_name,
                                   trail_t *trail);
    
      svn_error_t *(* get_txn) (svn_fs__transaction_t **txn_p, svn_fs_t *fs,
                                const char *txn_name, trail_t *trail);
    
      svn_error_t *(* put_txn) (svn_fs_t *fs, const svn_fs__transaction_t *txn,
                                const char *txn_name, trail_t *trail);
    
      svn_error_t *(* get_txn_list) (apr_array_header_t **names_p, svn_fs_t *fs,
                                     apr_pool_t *pool, trail_t *trail);
    
      /* 'uuids' table operations */
    
      svn_error_t *(* get_uuid) (svn_fs_t *fs, int idx, const char **uuid,
                                 trail_t *trail);
    
      svn_error_t *(* set_uuid) (svn_fs_t *fs, int idx, const char *uuid,
                                 trail_t *trail);
    
    } svn_fs_db_ops_t;
  6. Changed svnadmin/main.c and libsvn_repos/repos.c to take MySQL-specific arguments for repository creation, and to store those arguments in the repository directory.
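
The idea behind the operations table in step 5 can be shown with a much smaller stand-in. The sketch below is not the real svn_fs_t or svn_fs_db_ops_t: every name starting with sketch_ is hypothetical, the table has only two slots, and the backends merely flip a flag. The has_mysql_config parameter stands for whatever check decides that the repository is a MySQL one, for example the presence of the db/mysql file.

```c
#include <stddef.h>

struct sketch_fs_t;

/* Cut-down stand-in for svn_fs_db_ops_t: one slot per operation. */
typedef struct sketch_db_ops_t
{
  const char *name;
  int (*open) (struct sketch_fs_t *fs, const char *path);
  int (*close) (struct sketch_fs_t *fs);
} sketch_db_ops_t;

/* Cut-down stand-in for svn_fs_t: the table pointer plus some state. */
typedef struct sketch_fs_t
{
  const sketch_db_ops_t *ops;
  int is_open;
} sketch_fs_t;

/* Dummy BDB backend: a real one would open the DB environment. */
int
sketch_bdb_open (sketch_fs_t *fs, const char *path)
{
  (void) path;
  fs->is_open = 1;
  return 0;
}

int
sketch_bdb_close (sketch_fs_t *fs)
{
  fs->is_open = 0;
  return 0;
}

/* Dummy MySQL backend: a real one would connect to the server. */
int
sketch_mysql_open (sketch_fs_t *fs, const char *path)
{
  (void) path;
  fs->is_open = 1;
  return 0;
}

int
sketch_mysql_close (sketch_fs_t *fs)
{
  fs->is_open = 0;
  return 0;
}

const sketch_db_ops_t sketch_bdb_ops =
  { "bdb", sketch_bdb_open, sketch_bdb_close };
const sketch_db_ops_t sketch_mysql_ops =
  { "mysql", sketch_mysql_open, sketch_mysql_close };

/* Pick a backend once, then go through the table for everything else. */
int
sketch_fs_open (sketch_fs_t *fs, const char *path, int has_mysql_config)
{
  fs->ops = has_mysql_config ? &sketch_mysql_ops : &sketch_bdb_ops;
  fs->is_open = 0;
  return fs->ops->open (fs, path);
}
```

After the backend has been chosen, callers never name it again: closing the filesystem is just fs->ops->close (fs), whichever table was installed. That is what lets the generic code in fs.c stay free of BDB-specific and MySQL-specific calls.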

Source

As I have not deleted or moved any files, the easiest way to distribute source is a patch against trunk (revision 8926):

Copyright (C) 2004, Edmund Horner.     $Id: index.html 2412 2007-01-23 03:08:21Z Edmund $