Loading ...
Sorry, an error occurred while loading the content.

Would like to ask the best database/table to achieve my goals

Expand Messages
  • ehaerim
    I would like to start over my question again with detail description of my goals to achieve. There are 3000 symbols and each symbol creates 3 blobs data
    Message 1 of 1 , Dec 18, 2011
    View Source
    • 0 Attachment
      I would like to start over my question again with detail description of my goals to achieve.


      There are 3000 symbols and each symbol creates 3 blobs data everyday.
      Symbols are named for example, S0000001 - S0003000.
      Blobs are named as TB, MB, and DB. For example, tb000001_20110101 - tb000001_20111231.
      TB has variable size ahd the biggest size among them.
      TB's max size is less than 300KB.
      Most TB's size is much smaller than the max size, though.
      MB and DB sizes are much smaller than TB's.
      Everyday all the newly created blobs will be inserted into TT, MT, and DT table, respectively.
      Therefore 3000 TBs, 3000 MBs, and 3000 DBs will be inserted everyday.
      This insertion will be done only once in one day.
      Selection of a blob for some specific symbol/date is the most common operation.

      Goals to achieve.
      [1] To select a blob for a specific symbol/date as fast as possible.
      This is the most frequent operation.
      [2] To select the last date for a symbol.
      This is an operation to be done once in a day.
      [3] To insert all the 3000 blobs newly created as fast as possible.
      This is an operation to be done once in a day.
      [4] To keep the database size as small as possible.
      Once history blobs are inserted the probability any of them will get deleted/modified
      is very low because they are fact values.
      The database will grow everyday in almost fixed amount.

      Decisions to make to achieve the above goals.
      [1] Single big database for the whole years or yearly databases for each year?
      [2] Single key or multiple key table structure?

      I really don't have much experience with database area,
      I would ask your opinions/suggestions along with pretty detail explanations of why.

      Regarding [1],
      If the growing database size does NOT affect the performance of select/insert,
      naturally I would go with a single big database.
      If it does and will go worse against the performance,
      I would definitely split the databases into yearly ones.

      At the moment, I am currently testing with yearly databases
      because I thought the growing size will affect the performance.

      Also, I create a table with a single Symb_Date key structure.
      It's because having one primary key, I don't have to create an index for a primary key.
      If this decision is too bad for performance, I would like to change it.

      Below are sql statements I am using for your reference.

      - table creation
      CREATE TABLE TT(SYMB_DATE VARCHAR(64) NOT NULL PRIMARY KEY, TB BLOB SUB_TYPE BINARY);

      - selecting the last date a specific symbol => this will be done once everyday for only one symbol
      SELECT FIRST 1 RIGHT(SYMB_DATE, 8) FROM TT WHERE LEFT(SYMB_DATE, 8) = 'S0001004' ORDER BY SYMB_DATE DESC;

      - selecting a blob for a specific symbol/date => this will be done frequently for a random symbol/date
      SELECT TB FROM TT WHERE SYMB_DATE = 'S0001004_20111217';

      - inserting a blob for a specific symbol/date => this will be done once everyday for all symbols
      INSERT INTO TT(SYMB_DATE, TB) VALUES('S1004_20111218', <TB>);

      So, the records in TT will look like
      (s0000001_20110101, tb000001_20110101)
      ...
      (s0000001_20111231, tb000001_20111231)
      ....
      ....
      (s0003000_20110101, tb003000_20110101)
      ...
      (s0003000_20111231, tb003000_20111231)


      I would ask your precious opinions/suggestions for the best database/table configurations
      with the explanation of the rationale.

      Thx a lot.
    Your message has been successfully submitted and would be delivered to recipients shortly.