
Corrupt header problem

  • polydwarf820
    Message 1 of 15 , Jan 2, 2002
      We've been dealing with a problem that has occurred a couple of times
      in the past month or so on different client machines.
      For one reason or another (we haven't been able to pin it down), a
      customer calls in, saying that their data is corrupt. The way we
      solved this the first time, not really understanding what we were
      doing, was to copy the header block (As determined by looking at the
      interbase source code) from a "good" gdb file that had a lot of data
      in it (More than was in the bad file, definitely) to the bad gdb
      file. This seemed to work, the customer could see all of their data,
      and we went on our way.
      However, another customer has called in with the same problem, which
      is starting to give us concern. After trying the trick we tried
      before, we were able to recover some of the data, though not all
      (Some of the tables that should have in the neighborhood of 1000
      records or so have none. The data is in the gdb, as I could see it
      with a hex-editor).
      We narrowed down the issue in the header to the transaction
      references (hdr_oldest_transaction, hdr_oldest_active,
      hdr_next_transaction). If we changed those to older transaction
      references from an old backup of the customer's data (they don't
      have a recent backup, of course), we would recover the same amount
      of data as wholesale copying of the header block of the gdb file
      gave us.

      When browsing through one of the tables from the recovered data in
      IBConsole, we would get a few "invalid data conversion" errors, but
      we could continue scrolling. Trying to delete the individual record
      that seemed to be causing the problem caused a null record to be
      shown as the next record (The null record wasn't there before the
      attempted delete).
      We attempted a validate on the table, and the standard validation
      validated everything correctly the first time. When we included
      record fragments in the validation we got "Error 335544344. Error
      while trying to read from file. Reached end of file."
      A gbak -v revealed that it didn't think there were records at all in
      the tables that there should be plenty of records in, however the
      gbak -v itself went without issue (I didn't bother trying to do a
      grestore, since the records I want/need weren't reported in the bak)

      The fact that changing the transaction id's helps seems significant,
      though how it is, we're not sure. Possibly a transaction that was in
      the middle of processing got stopped by a machine failure (Hard lock,
      BSOD, etc), and led to corrupting the file? Stepping back the
      transaction ID numbers by 1 didn't help, we're going to be trying
      different values to see if anything makes a difference.
      The table that's giving us the data conversion errors when browsing
      it also would be linked to the root problem, I think, or at least it
      seems it should be linked.

      Has anyone come across anything vaguely similar, or is there some way
      to get record fragments out of a gdb, other than the by-hand method,
      which isn't going to happen? :)
      And on a baser level, if you've seen something like this, what were
      the conditions in which it happened, so we can try to keep it from
      happening again?

      - Jason
    • Helen Borrie
      Message 2 of 15 , Jan 2, 2002
        At 11:52 PM 02-01-02 +0000, polydwarf wrote:
        >[big snip]
        >
        >Has anyone come across anything vaguely similar, or is there some way
        >to get record fragments out of a gdb, other than the by-hand method,
        >which isn't going to happen? :)
        >And on a baser level, if you've seen something like this, what were
        >the conditions in which it happened, so we can try to keep it from
        >happening again?

        Jason,

        Because this is a repeating problem, you need to get to the bottom of what is causing the corruption. Two questions to get answers to first are:

        1. Is the database file close to, or over, the file system limit for the server platform? On Windows or Linux the limit is 2 Gb, apparently stretchable to 4Gb if you are using NTFS.

        2. If the server is on a Windows host, are you certain that all users (including those accessing the database with ad hoc client tools like IBConsole) are using the same, correct connection string EVERY TIME?

        This is correct (for TCP/IP):

        SERVERNAME:DISK:\path\ourdata.gdb

        This is incorrect but Windows will let you do it:

        SERVERNAME:DISK:path\ourdata.gdb

        If you have users mixing these two connection string formats, the server treats each client as if it were connecting to its own database and irreparable corruption ensues as night follows day.
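To make the difference between the two forms concrete, here is a minimal Python sketch that flags a TCP/IP connection string whose path lacks the backslash after the drive colon. The function name, the return labels, and the assumption that DISK stands for a single drive letter are illustrative only; none of this is part of InterBase or its client library.

```python
import re

# Assumed shape, following the thread: SERVERNAME:DISK:\path\file.gdb,
# where DISK is taken to be a single drive letter (an assumption).
_TCP_WINDOWS = re.compile(r"^(?P<server>[^:\\/]+):(?P<drive>[A-Za-z]):(?P<rest>.*)$")


def check_connection_string(conn: str) -> str:
    """Return 'ok', 'missing-root-backslash', or 'unrecognised'."""
    match = _TCP_WINDOWS.match(conn)
    if match is None:
        # Not in SERVERNAME:DISK:... form at all (for example a bare local path).
        return "unrecognised"
    if match.group("rest").startswith("\\"):
        return "ok"
    # Drive colon is followed directly by a path component: the form
    # the thread describes as incorrect.
    return "missing-root-backslash"
```

A check like this could be run by an application over the string it reads from its own configuration before attempting to connect. It only looks at the shape of the string and says nothing about whether the server or file exists.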

        Next: do you have Forced Writes enabled? If you are using a Windows server with Forced Writes disabled, you are skydiving without a parachute and hoping to land in a big fat haystack.

        If you have eliminated these recognised sources of corruption, the next step would be to look for gremlins in the network hardware (esp. if Forced Writes is off).

        These things apart, it's actually pretty hard to corrupt an IB database. Generally it will survive someone tripping over the server cable or a lightning strike that doesn't burn out the HDD...

        As to rescuing record fragments, that's probably going to be a job for an expert if you have already been through the process described at http://www.ibphoenix.com/ibp_db_corr.html

        regards,
        Helen
      • polydwarf820
        Message 3 of 15 , Jan 2, 2002
          > Two questions to get answers to first are:
          >
          > 1. Is the database file close to, or over, the file system limit
          > for the server platform? On Windows or Linux the limit is 2 Gb,
          > apparently stretchable to 4Gb if you are using NTFS.

          Nowhere near... GDB file size is ~25 meg.

          > 2. If the server is on a Windows host, are you certain that all
          > users (including those accessing the database with ad hoc client
          > tools like IBConsole) are using the same, correct connection string
          > EVERY TIME?
          >
          > This is correct (for TCP/IP):
          >
          > SERVERNAME:DISK:\path\ourdata.gdb
          >
          > This is incorrect but Windows will let you do it:
          >
          > SERVERNAME:DISK:path\ourdata.gdb
          >
          > If you have users mixing these two connection string formats, the
          > server treats each client as if it were connecting to its own
          > database and irreparable corruption ensues as night follows day.

          Really? I thought all that was needed was SERVERNAME: to make the
          client make the connection to the remote machine's Interbase engine
          as a "remote" client.
          Very interesting... I have no idea if our installer is setting this
          correctly (I'd bet it is, but I'm not 100% sure). Then again, it's
          also difficult to rule out the possibility that Joe Schmoe user
          changed something after the software got installed, either.
          Anyways, yes, we're running Windows. It'd be too complicated to try to
          support a database system on Linux, even though we'd like to.

          > Next: do you have Forced Writes enabled? If you are using a
          > Windows server with Forced Writes disabled, you are skydiving
          > without a parachute and hoping to land in a big fat haystack.

          I would guess we do, because according to the link posted below,
          default on NT/2k machines is forced writes on. The machines that do
          all database creation are 2k machines.
          However, being as this gdb gets installed on multiple platforms,
          depending on the customer (Win98, NT, 2K), it kind of depends on what
          the server decides based on its OS, correct?

          > If you have eliminated these recognised sources of corruption, the
          > next step would be to look for gremlins in the network hardware
          > (esp. if Forced Writes is off).

          Hmm.. Not a pleasant thought, but I'll see what can be found out. :)

          > These things apart, it's actually pretty hard to corrupt an IB
          > database. Generally it will survive someone tripping over the
          > server cable or a lightning strike that doesn't burn out the HDD...

          That's the impression we were under as well, when we switched to it.
          Someone found a story of it being used in tanks, for instance..

          > As to rescuing record fragments, that's probably going to be a job
          > for an expert if you have already been through the process
          > described at http://www.ibphoenix.com/ibp_db_corr.html

          We figured out at the end of the day that we could roll the
          transaction counters back in the gdb's header and at some point (We
          were going by a large number of transactions at a time), the database
          became "openable", with more data than we had before. We're going to
          be writing a program tomorrow to decrement the counters by one,
          attempt to open the gdb file, and then repeat until it can open. The
          idea is we can recover as much data as possible using this approach.
          I don't suppose anyone else has written this already, or tried, and
          figured out that it's not the right thing to do? :)
          In any event, thank you for your input. I'm hoping after a night's
          sleep everything will somehow magically work itself out. :)
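Before changing anything in a header, it may help to simply read the three counters and log them. The following is a minimal, read-only Python sketch. The byte offsets are deliberately not filled in: they have to come from the header structure in the source code for the file's on-disk structure version. The function name and the assumption of signed 32-bit little-endian values are illustrative.

```python
import struct
from typing import Dict


def read_header_counters(page: bytes, offsets: Dict[str, int]) -> Dict[str, int]:
    """Read counters from a header page buffer without modifying it.

    `offsets` maps a field name (e.g. "hdr_next_transaction") to its byte
    offset within the page. The offsets are caller-supplied assumptions.
    Values are decoded as signed 32-bit little-endian integers, which is
    also an assumption about the platform that wrote the file.
    """
    return {
        name: struct.unpack_from("<i", page, offset)[0]
        for name, offset in offsets.items()
    }
```

Used against a copy of the gdb file (never the live one), the first page can be read with `open(path, "rb").read(page_size)` and passed in as `page`.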

          - Jason
        • Helen Borrie
          Message 4 of 15 , Jan 2, 2002
            At 03:58 AM 03-01-02 +0000, you wrote:

            >> 2. If the server is on a Windows host, are you certain that all
            >> users (including those accessing the database with ad hoc client
            >> tools like IBConsole) are using the same, correct connection string
            >> EVERY TIME?
            >>
            >> This is correct (for TCP/IP):
            >>
            >> SERVERNAME:DISK:\path\ourdata.gdb
            >>
            >> This is incorrect but Windows will let you do it:
            >>
            >> SERVERNAME:DISK:path\ourdata.gdb
            >>
            >> If you have users mixing these two connection string formats, the
            >> server treats each client as if it were connecting to its own
            >> database and irreparable corruption ensues as night follows day.
            >
            >Really? I thought all that was needed was SERVERNAME: to make the
            >client make the connection to the remote machine's Interbase engine
            >as a "remote" client.

            There might be some confusion here. If you ping SERVERNAME successfully, that means that the TCP/IP link to the host machine is working. It doesn't necessarily mean that the network layer of the client program (gds32.dll on a Windows client) is able to find and connect to a specific database on the host machine. IOW, SERVERNAME finds the machine and the *physical* path (local to the SERVERNAME) finds the database file.

            >Very interesting... I have no idea if our installer is setting this
            >correctly (I'd bet it is, but I'm not 100% sure).

            If your installer is setting it, should one assume your application uses the BDE? There's no "server configuration" that identifies the location of databases, unless your installer is writing it into some Registry key or ini file that your application program knows about.

            >Then again, it's
            >also difficult to rule out the possibility that Joe Schmoe user
            >changed something after the software got installed, either.

            If you are using the BDE, then Joe Schmoe could certainly go in and mess up the alias setting in idapi.cfg on his or anyone's machine. Likewise, if Joe S. or anyone else is accessing the database file either locally on the server or from a remote workstation, using IBConsole, DBExplorer, Database Desktop, any of the numerous free or commercial "database manager" tools out there (including the command-line tools that ship with InterBase) and uses that bad connection string, even just once in a while when others are logged in to the database, then the corruption will surely follow.

            >Anyways yes we're running windows. It'd be too complicated to try to
            >support a database system on Linux, even though we'd like to.

            InterBase running on a Linux server is just like a worm farm. It sits there doing good, unnoticed. Unlike a Win server, it doesn't need to be nursed, coaxed or bribed to behave.


            >> Next: do you have Forced Writes enabled? If you are using a
            >> Windows server with Forced Writes disabled, you are skydiving
            >> without a parachute and hoping to land in a big fat haystack.
            >
            >I would guess we do, because according to the link posted below,
            >default on NT/2k machines is forced writes on. The machines that do
            >all database creation are 2k machines.

            Ah. Let's be clear. Forced Writes is a DATABASE setting, not an OS setting. If you are using an InterBase(R) v. 6.x server on any Windows platform and you didn't set Forced Writes ON yourself, then it won't be on. (Same applies to the pre-RC1 betas of Firebird).

            >However, being as this gdb gets installed on multiple platforms,
            >depending on the customer (Win98, NT, 2K), it kind of depends on what
            >the server decides based on its OS, correct?

            The default InterBase(R) setting for Linux/UNIX servers is Forced Writes ON. For any flavour of Windows, it is OFF. It's kinda like "double jeopardy" since Windows is very poor at honouring lazy writes...in fact, recent evidence seems to indicate that it simply never does it unless and until you shut the server down entirely. This could get quite interesting in a 24/7 operation...


            >We figured out at the end of the day that we could roll the
            >transaction counters back in the gdb's header and at some point (We
            >were going by a large number of transactions at a time), the database
            >became "openable", with more data than we had before. We're going to
            >be writing a program tomorrow to decrement the counters by one,
            >attempt to open the gdb file, and then repeat until it can open. The
            >idea is we can recover as much data as possible using this approach.
            >I don't suppose anyone else has written this already, or tried, and
            >figured out that it's not the right thing to do? :)

            I would want Ann Harrison's opinion about that before I would even consider going in and tinkering like this with a production database.

            >In any event, thank you for your input. I'm hoping after a night's
            >sleep everything will somehow magically work itself out. :)

            My retirement plan consists of buying an autopick in State Lotto every Monday and Wednesday. Some day before I'm 65 I'm just BOUND to land the big one :))

            cheers,
            Helen
          • polydwarf820
            Message 5 of 15 , Jan 2, 2002
              > There might be some confusion here. If you ping SERVERNAME
              > successfully, that means that the TCP/IP link to the host machine
              > is working. It doesn't mean necessarily that the network layer
              > between the client program (gds32.dll on a Windows client) is able
              > to find and connect to a specific database on the host machine.
              > IOW, SERVERNAME finds the machine and the *physical* path (local to
              > the SERVERNAME) finds the database file.

              Right, but there's such a thing, I've found, as "remote" connections
              and "local" connections. To wit:

              Pick two random Win2k boxes.
              Box1 has the interbase server and gdb files.
              Box2 is a client.

              Box2 can connect two different ways (Assuming only TCP/IP).
              First way is the way we all know and love:

              box1:c:\database.gdb

              The second way is by mapping a drive (Say Z) to Box1's root c.
              Then, you can connect like:
              z:\database.gdb

              Basically, using localhost type protocols. Box2's interbase engine is
              responsible for the access in this case. Not good to mix'n'match,
              definitely.
              However, what I thought you were saying was that if your connection
              string looked like box1:c:database.gdb, that was the same in terms of
              engine access as the mapping-the-drive method. Perhaps I
              misunderstood, or not.

              > If your installer is setting it, should one assume your application
              > uses the BDE? There's no "server configuration" that identifies
              > the location of databases, unless your installer is writing it into
              > some Registry key or ini file that your application program knows
              > about.

              Unfortunately, that's correct. We needed to get it working ASAP, and
              the BDE was the easiest route to go for the time being. Our next
              version (Still under development) is using Interbase Express just
              fine. As a side effect, though, transactions are not explicitly done
              at all (In the BDE version. Part of the rewrite involved enabling
              transactions). I'm assuming that TDatabases and TQueries do their
              own transactions intelligently, but you know what they say about
              assuming things.
              However, the connect string is stored in an ini file, not in an
              actual BDE alias. We make our own connection on the fly. This is
              both good and bad. Only one place to check, one setting to check,
              but users feel they understand ini files more than they do
              bdeadmin.exe.


              > >Anyways yes we're running windows. It'd be too complicated to try to
              > >support a database system on Linux, even though we'd like to.
              >
              > InterBase running on a Linux server is just like a worm farm. It
              > sits there doing good, unnoticed. Unlike a Win server, it doesn't
              > need to be nursed, coaxed or bribed to behave.

              Don't get me wrong, we would like to run Interbase on Linux, however,
              if anything were to happen, our customers would not be able to deal
              with anything. Windows is at least familiar to people, even those
              that don't know much. Our costs are such that it's not feasible to
              fly someone out in the event a linux box fails... And trying to talk
              someone through a Linux install over the phone? Ugh.

              > Ah. Let's be clear. Forced Writes is a DATABASE setting, not an
              > OS setting. If you are using an InterBase(R) v. 6.x server on any
              > Windows platform and you didn't set Forced Writes ON yourself, then
              > it won't be on. (Same applies to the pre-RC1 betas of Firebird).

              Hrmm.. What I had read basically said the opposite. Oh well.

              > The default InterBase(R) setting for Linux/UNIX servers is Forced
              > Writes ON. For any flavour of Windows, it is OFF. It's kinda
              > like "double jeopardy" since Windows is very poor at honouring lazy
              > writes...in fact, recent evidence seems to indicate that it simply
              > never does it unless and until you shut the server down entirely.
              > This could get quite interesting in a 24/7 operation...

              The document re repairing a database mentioned that on NT/2K it was
              on by default, on 95/98, the setting had no effect, and that the
              server made it off.
              I will definitely be checking the databases at the office tomorrow to
              get a definitive answer.

              > I would want Ann Harrison's opinion about that before I would even
              > consider going in and tinkering like this with a production
              > database.

              Agreed, which is why I'm outlining my plan, in the hopes that someone
              will say "Hey, that's a dumb idea." :)
              All I can say is that the very brief bit of empirical evidence I've
              seen indicates it works.

              - Jason
            • Helen Borrie
              Message 6 of 15 , Jan 2, 2002
                At 06:18 AM 03-01-02 +0000, you wrote:


                >Right, but there's such a thing, I've found, as "remote" connections
                >and "local" connections. To wit:
                >
                >Pick two random Win2k boxes.
                >Box1 has the interbase server and gdb files.
                >Box2 is a client.
                >
                >Box2 can connect two different ways (Assuming only TCP/IP).
                >First way is the way we all know and love:
                >
                >box1:c:\database.gdb
                >
                >The second way is by mapping a drive (Say Z) to Box1's root c.
                >Then, you can connect like:
                >z:\database.gdb

                No, an InterBase client can't connect to a windows mapped drive.


                >Basically, using localhost type protocols. Box2's interbase engine is
                >responsible for the access in this case. Not good to mix'n'match,
                >definitely.

                No, it is not using localhost protocols. Windows has "local connection" by which a client located on the same *physical* machine as the server + databases connects to the database through a thing called "client-mapped memory". In this case there is no SERVERNAME in the string, it is straight c:\database.gdb. Other machines can't connect to the database that way.

                Then there is also tcp/ip local loopback, again, only possible for a client connecting to a server + database on the same physical machine. In this case the connection string is localhost:c:\database.gdb - or it could be box1:c:\database.gdb if you have an entry in the HOSTS file for the IP address 127.0.0.1:
                127.0.0.1 box1 #local loopback incognito

                >However, what I thought you were saying was that if your connection
                >string looked like box1:c:database.gdb, that was the same in terms of
                >engine access as the mapping-the-drive method. Perhaps I
                >misunderstood, or not.

                You misunderstood. The mapped-drive "method" is not available. The bad string is permitted by Windows as an alternative to the legal tcp/ip string. The problem is, Windows doesn't realise that user A connecting with the bad string and user B connecting with the legal string are actually connecting to the same database file and so it sends the wrong message to the server. The server then treats the two connections as if they were to two different databases; so the work of the two users is not protected by record versioning nor exclusion by transaction control.

                Remember - the server controls potentially many databases, even if your installation involves only one. So the server has no way to know that the two users are not working in different databases, if the operating system tells it that they are.
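As a rough illustration of the point above, here is a toy Python model in which a server tracks its open databases by the path string it is handed. This is not how InterBase is implemented; the class and method names are invented purely for the illustration. It only shows why two spellings of one file can end up looking like two databases.

```python
from typing import Dict, List


class ToyServer:
    """Toy model only: open databases are keyed by the raw path string."""

    def __init__(self) -> None:
        # path string exactly as received -> users attached under that string
        self.attachments: Dict[str, List[str]] = {}

    def attach(self, user: str, path: str) -> None:
        # No normalisation is done, so different spellings get separate entries.
        self.attachments.setdefault(path, []).append(user)

    def databases_seen(self) -> int:
        return len(self.attachments)


server = ToyServer()
server.attach("user_a", "c:database.gdb")     # the bad form
server.attach("user_b", "c:\\database.gdb")   # the legal form
# Both users are on the same physical file, yet the toy server now holds
# two separate entries, each with its own bookkeeping.
```

In this toy, `server.databases_seen()` is 2 even though only one file is involved, which is the situation in which nothing coordinates the two users' page writes.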


                >> If your installer is setting it, should one assume your application
                >> uses the BDE? There's no "server configuration" that identifies
                >> the location of databases, unless your installer is writing it into
                >> some Registry key or ini file that your application program knows
                >> about.
                >
                >Unfortunately, that's correct. We needed to get it working ASAP, and
                >the BDE was the easiest route to go for the time being. Our next
                >version (Still under development) is using Interbase Express just
                >fine. As a side effect, though, transactions are not explicitly done
                >at all (In the BDE version. Part of the rewrite involved enabling
                >transactions). I'm assuming that TDatabases and TQueries do their
                >own transactions intelligently, but you know what they say about
                >assuming things.

                Whether you are using explicit transaction control or not is not relevant to the problem of the "bad connection string bug". It's a matter of miscommunication between the operating system and the server and it is caused by clients connecting with these mixed connection string formats. Why I asked about the BDE is that any user of any BDE application can stuff up the BDE alias for connecting to a database and become the source of corruption for the database itself.

                But it's not just in BDE applications it can happen. It can happen in any situation where users are able to connect to a database using a string they designed themselves.

                >However, the connect string is stored in an ini file, not in an
                >actual BDE alias. We make our own connection on the fly. This is
                >both good and bad. Only one place to check, one setting to check,
                >but users feel they understand ini files more than they do
                >bdeadmin.exe.

                Ideally, if users go in and mess around with the connection string in their ini file, registry key or whatever, it should just prevent them from connecting. That's the assumption in the client code, too - either the string is "valid" or the connection will fail. Unfortunately, Windows has this undocumented "feature" whereby it will accept an illegal connection string but without the additional "feature" whereby it knows that the good string and the bad string should be equivalent.


                >> Ah. Let's be clear. Forced Writes is a DATABASE setting, not an
                >> OS setting. If you are using an InterBase(R) v. 6.x server on any
                >> Windows platform and you didn't set Forced Writes ON yourself, then
                >> it won't be on. (Same applies to the pre-RC1 betas of Firebird).
                >
                >Hrmm.. What I had read basically said the opposite. Oh well.
                >
                >> The default InterBase(R) setting for Linux/UNIX servers is Forced
                >> Writes ON. For any flavour of Windows, it is OFF. It's kinda
                >> like "double jeopardy" since Windows is very poor at honouring lazy
                >> writes...in fact, recent evidence seems to indicate that it simply
                >> never does it unless and until you shut the server down entirely.
                >> This could get quite interesting in a 24/7 operation...
                >
                >The document re repairing a database mentioned that on NT/2K it was
                >on by default, on 95/98, the setting had no effect, and that the
                >server made it off.

                Hmmm...it's true that Win 95/98 (and ME too, I believe) don't support delayed (lazy) writes. But that's not the point, since your server is Win2K and it does provide the option of delayed or immediate writes. Borland InterBase(R) v. 6.x for Windows installs with Forced Writes OFF by default or, to look at it from the OS's point of view, "with delayed writes enabled". (Firebird RC2 changed that to Forced Writes ON by default, btw.)

                Helen
              • Ann W. Harrison
                Message 7 of 15 , Jan 3, 2002
                  At 11:52 PM 1/2/2002 +0000, polydwarf820 (Jason) wrote:


                  >For one reason or another (we haven't been able to pin it down), a
                  >customer calls in, saying that their data is corrupt.

                  Jason,

                  Your analysis is pretty good, though I would like to make some
                  suggestions about reporting problems. First, specify the version
                  of Firebird or InterBase you're using. Different versions have
                  different problems. Second, give exact error messages - even if
                  that means pressuring your clients to say something more than
                  "my database is corrupt". Third, encourage your clients to backup
                  their databases regularly.

                  That said, your analysis is enough to tell me what happened - with
                  a 90% probability. You're running on a windows operating system and
                  you're using different connect strings for different clients. That
                  causes InterBase V4, V5, & V6 to attach the database once for each
                  connect string, breaking the lowest level of concurrency control -
                  the part that maintains the on-disk consistency.

                  The best solution is to convert to Firebird RC2, which will detect
                  that problem and prevent the corruption. The next best solution is
                  to move your databases from Windows to Linux (or Solaris, or HP, or
                  any grown-up operating system). The third best solution is to be
                  absolutely religious about connection strings - including the strings
                  used by the system administrator with IBConsole.

                  The "end of file" error occurs because the system is looking for a
                  transaction inventory page (80% probability) that did not get written.
                  If you increment the next transaction to some arbitrary value with
                  a hex editor, you may have made the system think that it has more
                  TIPs than were actually created.
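The arithmetic behind that can be sketched roughly as follows. Each transaction inventory page holds a fixed amount of state per transaction, so the next-transaction counter implies how many TIPs must exist. This Python sketch assumes two state bits per transaction and treats the page size and per-page header overhead as parameters; the header size in the example is a placeholder, not a value taken from ods.h, and the function names are hypothetical.

```python
def transactions_per_tip(page_size, tip_header_bytes, bits_per_transaction=2):
    """How many transaction states fit on one TIP (assumed layout)."""
    usable_bits = (page_size - tip_header_bytes) * 8
    return usable_bits // bits_per_transaction


def tips_required(next_transaction, per_tip):
    """Number of TIPs needed to hold states for ids 0..next_transaction."""
    return next_transaction // per_tip + 1


if __name__ == "__main__":
    per_tip = transactions_per_tip(page_size=4096, tip_header_bytes=20)  # placeholder header size
    for next_tx in (1000, per_tip, 50000):
        print(next_tx, "->", tips_required(next_tx, per_tip), "TIP(s) expected")
```

If the counter in the header implies more TIPs than were ever written, the engine goes looking for a page that is not in the file.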

                  The missing data occurs because the header page, which is the source
                  of transaction IDs, was not written, so new transactions "think"
                  they're older than they are and that they're not "allowed" to read
                  data that should be committed.

                  If you are using InterBase, I would suggest contacting Borland - this
                  is a very dangerous bug and one that they have swept under the rug
                  for years. The problem can also show up as "wrong page type" and
                  several other, even less pleasant errors.



                  Regards,

                  Ann
                  www.ibphoenix.com
                  We have answers.
                • polydwarf820
                  Message 8 of 15, Jan 3, 2002
                    > Your analysis is pretty good, though I would like to make some
                    > suggestions about reporting problems. First, specify the version
                    > of Firebird or InterBase you're using. Different versions have
                    > different problems. Second, give exact error messages - even if
                    > that means pressuring your clients to say something more than
                    > "my database is corrupt". Third, encourage your clients to backup
                    > their databases regularly.

                    1. We're using the latest OpenSource Interbase (6.0.1.6 Open
                    Edition)... Moving to Firebird is definitely an option in the
                    future, however for right now, we're trying to fix the bugs we have
                    (The devil we know, versus the devil we don't, and all that..
                    Possibly in the new IBX rewrite of our software).
                    2. Exact error message is, when trying to open this database with a
                    client (IBConsole is what gave us the following message), "IO Error
                    for <filename>. Error while trying to read from file. Reached end
                    of file."
                    3. The customer had a "backup" system in place, however it consisted
                    of a nightly backup... to the same media (We didn't know this until
                    we tried to restore from backup, and then they told us what their
                    backup procedure was. Of course, the file was corrupted when they
                    did their last nightly backup). We're in the process of informing
                    their IT controller about good backup procedures.

                    > That said, your analysis is enough to tell me what happened - with
                    > a 90% probability. You're running on a windows operating system and
                    > you're using different connect strings for different clients. That
                    > causes InterBase V4, V5, & V6 to attach the database once for each
                    > connect string, breaking the lowest level of concurrency control -
                    > the part that maintains the on-disk consistency.

                    After discussing more with our support people, I've found that we are
                    definitely doing this.
                    All of the "remote"/"client" workstations (i.e., machines that the
                    server is not on) are using the same connect string (as set by our
                    installer, which does use a proper connect string, i.e.,
                    SERVERNAME:c:\data\database.gdb).
                    However, on the machine that the server does reside on, the connect
                    string is just c:\data\database.gdb... No SERVERNAME, etc. This has
                    been remedied this morning in our install program, so that the server
                    machine's connect string for our application is exactly the same as
                    the client's connect string.

                    > The best solution is to convert to Firebird RC2, which will detect
                    > that problem and prevent the corruption. The next best solution is
                    > to move your databases from Windows to Linux (or Solaris, or HP, or
                    > any grown-up operating system). The third best solution is to be
                    > absolutely religious about connection strings - including the
                    > strings used by the system administrator with IBConsole.

                    Solution 1 is something we'll be discussing. However, our concern is
                    when Interbase and Firebird diverge enough that components to access
                    one won't work to access the other (Specifically Interbase Express).
                    If there are going to be IBX components to do the same job, with at
                    least the same performance, then we're fine.

                    > The "end of file" error occurs because the system is looking for a
                    > transaction inventory page (80% probability) that did not get
                    > written. If you increment the next transaction to some arbitrary
                    > value with a hex editor, you may have made the system think that
                    > it has more tips than were actually created.

                    We were getting the end of file error, before we started hex editing
                    the header of the gdb file.
                    When we started playing with transaction counters, we were
                    decrementing them, not incrementing them, on the assumption that if
                    we could get back to the last "good" transaction, we could then use
                    the gdb file from there, with the amount of data that could be "seen"
                    and whatever was lost was the client's fault, because of their poor
                    backup procedures.
                    My question is, if all of the restore procedures that have been
                    outlined previously don't work, is decrementing the transaction
                    counters like this, to get to the last "good" transaction in the
                    file, a valid approach to take to repair this file and get the
                    customer back up and running in the short term (Probably with a
                    gbak backup/restore after the gdb file can be opened, so we have a clean
                    gdb file)? Will there be issues that are not immediately obvious
                    (Aside from the loss of data that's almost sure to result)?

                    - Jason
                  • Leyne, Sean
                    Message 9 of 15, Jan 3, 2002
                      Jason,

                      > 1. We're using the latest OpenSource Interbase (6.0.1.6 Open
                      > Edition)... Moving to Firebird is definitely an option in the
                      > future, however for right now, we're trying to fix the bugs we have
                      > (The devil we know, versus the devil we don't, and all that..
                      > Possibly in the new IBX rewrite of our software).

                      Although I may be a _little_ biased, I think it is safe to say that
                      Firebird is far ahead of IB OE. Firebird has at least 25 fixes for bugs
                      that 'plague' IB 6.0 (both OE and certified).

                      Also, Borland's commitment to the Open Source product is less than clear;
                      they changed their commitment with the release of 6.5.
                      Originally, they had committed to keep the OE updated with all core
                      engine changes; now it is all bug fixes -- they also have a 'unique' view
                      of bug fix vs. new feature.

                      I think that you should be reviewing your options now, rather than
                      later.

                      <snip>


                      > Solution 1 is something we'll be discussing. However, our concern is
                      > when Interbase and Firebird diverge enough that components to access
                      > one won't work to access the other (Specifically Interbase Express).
                      > If there are going to be IBX components to do the same job, with at
                      > least the same performace, then we're fine.

                      Jeff Overcash has already stated that IBX and IBConsole will not support
                      any Firebird custom features -- so unless you plan to be with IB forever
                      (the 'certified' version since OE life span is unknown), you should
                      evaluate your options.

                      Fortunately, if you make a wise decision now, you won't be affected by
                      any divergence which might occur. IBObjects and FIB (the stepfather
                      to IBX) both have good support for Firebird and InterBase and are
                      likely to maintain support for both products.


                      --
                      Sean Leyne

                      There is nothing wrong with Interbase,
                      that can't be fixed with Firebird.
                      http://FirebirdSQL.org
                    • Kaputnik
                      Message 10 of 15, Jan 3, 2002
                        > Solution 1 is something we'll be discussing. However, our concern is
                        > when Interbase and Firebird diverge enough that components to access
                        > one won't work to access the other (Specifically Interbase Express).
                        > If there are going to be IBX components to do the same job, with at
                        > least the same performace, then we're fine.
                        >
                        There is FIBPlus. You can convert pretty easily from IBX to FIBPlus, and the
                        performance and stability are better.
                        Also, there are the IBObjects. These are the best of all:
                        far better performance, and you get solutions instead of workarounds.
                        Don't cling too tightly to IBX; almost everything you replace it with
                        (except the BDE) will be at least as stable and perform at least as well. I can
                        even speak from some experience now, as I merged a fairly small
                        program (20 forms, 35 tables) from a colleague from IBX into our
                        IBO solution not that long ago, and working with IBX again was something
                        like a nightmare for me after being pampered by IBObjects. Ah, and the
                        performance increase was significant :-)

                        Please direct any flames regarding this post to
                        mailto:dont@... :-)

                        Cu, Nick
                      • Helen Borrie
                        Message 11 of 15, Jan 3, 2002
                          At 04:07 PM 03-01-02 +0000, you wrote:


                          >1. We're using the latest OpenSource Interbase (6.0.1.6 Open
                          >Edition)...

                          That "latest" Open Source InterBase binary is more than six months old. It was (is) the last free binary you will see from Borland. Furthermore, Borland's InterBase general manager Jon Arthur recently announced that the Open Edition won't be receiving any of the changes that they put into their commercial editions. So even if you build your own from source code, it's going to be the same old thing with perhaps a few bugs fixed that were already fixed in Firebird long ago. InterBase(R) Open Edition has no apparent future. Are you prepared to buy commercial licensing? That's what InterBase(R) is all about. Borland isn't a charitable institution, after all.

                          >Moving to Firebird is definitely an option in the
                          >future, however for right now, we're trying to fix the bugs we have
                          >(The devil we know, versus the devil we don't, and all that..
                          >Possibly in the new IBX rewrite of our software).

                          If you are going to rewrite your software with IBX then count on commercial InterBase(R) in your future. IBX is locked into commercial InterBase. Alternatively, move to Firebird and do your future development with IB Objects (which will directly convert your BDE app, a sub-five-minute job) or FIB-Plus.


                          >> That said, your analysis is enough to tell me what happened - with
                          >> a 90% probability. You're running on a windows operating system and
                          >> you're using different connect strings for different clients. That
                          >> causes InterBase V4, V5, & V6 to attach the database once for each
                          >> connect string, breaking the lowest level of concurrency control -
                          >> the part that maintains the on-disk consistency.
                          >
                          >After discussing more with our support people, I've found that we are
                          >definitely doing this.
                          >All of the "remote"/"client" workstations (IE machines that the
                          >server is not on) are using the same connect string (As set by our
                          >installer, which does do a proper connect string, ie
                          >SERVERNAME:c:\data\database.gdb).
                          >However, the machine that the server does reside on, the connect
                          >string is just c:\data\database.gdb... No SERVERNAME, etc. This has
                          >been remedied this morning in our install program, so that the server
                          >machine's connect string for our application is exactly the same as
                          >the client's connect string.

                          A local connect on the server won't be the source of your corruption, and the change you have made won't fix it. The bad path bug is not caused by mixing a local connection with remote connections; it's caused by the anomalous strings as already described to you by Ann and me.

                          It's probably insufficient just to ask your support people. You need to go to the users who reported the problems and find out who is using IBConsole, DBExplorer, Database Desktop, etc. to make ad hoc connections to their databases using the bad string. Look at everyone's BDE setup individually, too. It just takes one.


                          >Solution 1 is something we'll be discussing. However, our concern is
                          >when Interbase and Firebird diverge enough that components to access
                          >one won't work to access the other (Specifically Interbase Express).
                          >If there are going to be IBX components to do the same job, with at
                          >least the same performace, then we're fine.

                          Your options are: Borland InterBase commercial editions + IBX (or, better, IBO or FIB-Plus)

                          or

                          Firebird + IBO or FIBPlus.

                          Don't count on IBX keeping up with Firebird.

                          (Or you could, of course, just stick with poor old untested IB Open Edition 6.0.0.0.0.0.0.1.6 for ever more and spend weeks converting your BDE app to IBX...)

                          HB


                          All for Open and Open for All
                          Firebird Open SQL Database · http://firebirdsql.org
                          _______________________________________________________
                        • Woody
                          Message 12 of 15, Jan 3, 2002
                            From: "Helen Borrie" <helebor@...>
                            <<
                            That "latest" Open Source InterBase binary is more than six months old. It
                            was (is) the last free binary you will see from Borland. Furthermore,
                            Borland's InterBase general manager Jon Arthur recently announced that the
                            Open Edition won't be receiving any of the changes that they put into their
                            commercial editions. So even if you build your own from source code, it's
                            going to be the same old thing with perhaps a few bugs fixed that were
                            already fixed in Firebird long ago. InterBase(R) Open Edition has no
                            apparent future. Are you prepared to buy commercial licensing? That's what
                            InterBase(R) is all about. Borland isn't a charitable institution, after
                            all.
                            >>

                            This is not exactly the case with IB. It was always stated that only bug
                            fixes will immediately go into the open edition. It will probably lag one or
                            two versions behind the commercial version as regards new features, some of
                            which may require a core engine change. This is not the last open edition
                            version, AFAIK, and that has at least been stated by members of Borland
                            on the newsgroups.

                            <<
                            If you are going to rewrite your software with IBX then count on commercial
                            InterBase(R) in your future. IBX is locked into commercial InterBase.
                            Alternatively, move to Firebird and do your future development with IB
                            Objects (which will directly convert your BDE app, a sub-five-minute job) or
                            FIB-Plus.
                            >>

                            Again, IBX is locked into IB, not necessarily into the commercial version.

                            However, all that said, the choice to use either IB or FB should be made now
                            while they are still so similar in operation. No one knows where they will
                            be 6 months from now or more so it would be easier to convert either way at
                            present rather than wait. I fear that even Jason could be vastly overworked
                            in the future trying to keep up with both versions, but to date, he is the
                            only one doing so efficiently and with apparently fewer bugs. Let's hope he
                            continues.

                            Woody (the Original)

                            ----------------------
                            Can vegetarians eat animal crackers?
                            George Carlin
                          • Helen Borrie
                            Message 13 of 15, Jan 3, 2002
                              At 04:00 AM 04-01-02 +1100, you wrote:
                              >At 04:07 PM 03-01-02 +0000, you wrote:
                              >
                              >
                              >>1. We're using the latest OpenSource Interbase (6.0.1.6 Open
                              >>Edition)...
                              >

                              Just as a matter of interest, what versions of the BDE and the InterBase driver are you using?

                              What dialect is the database?

                              HB


                              All for Open and Open for All
                              Firebird Open SQL Database · http://firebirdsql.org
                              _______________________________________________________
                            • Robert F. Tulloch
                              Message 14 of 15, Jan 3, 2002
                                >>Solution 1 is something we'll be discussing. However, our concern is
                                >>when Interbase and Firebird diverge enough that components to access
                                >>one won't work to access the other (Specifically Interbase Express).
                                >>If there are going to be IBX components to do the same job, with at
                                >>least the same performace, then we're fine.
                                >>
                                >
                                > Your options are: Borland InterBase commercial editions + IBX (or, better, IBO or FIB-Plus)
                                >
                                > or
                                >
                                > Firebird + IBO or FIBPlus.
                                >
                                > Don't count on IBX keeping up with Firebird.
                                >
                                > (Or you could, of course, just stick with poor old untested IB Open Edition 6.0.0.0.0.0.0.1.6 for ever more and spend weeks converting your BDE app to IBX...)

                                It doesn't take weeks at all. Maybe a day or two.

                                I am using the "old" IB and IBX and it works just fine. I am
                                leery of FB because I don't want to get rid of IBX.
                              • Ann W. Harrison
                                Message 15 of 15, Jan 3, 2002
                                  Jason wrote:


                                  >1. We're using the latest OpenSource Interbase (6.0.1.6 Open
                                  >Edition)...

                                  OK, that version has that problem.

                                  >2. Exact error message is, when trying to open this database with a
                                  >client (IBConsole is what gave us the following message), "IO Error
                                  >for <filename>. Error while trying to read from file. Reached end
                                  >of file."

                                  That's probably a missing TIP. Do you run with forced writes enabled?

                                  >3. The customer had a "backup" system in place...

                                  Do encourage them to use gbak to produce the file that they put on
                                  their backup medium. Gbak does not require taking the database off-line
                                  during the backup and tends to clean up little pieces of lint that
                                  build up in the database. If they are copying a running database to
                                  their backup, then they are almost certainly creating corrupt backups.
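For a nightly job, the gbak call could be assembled roughly like this. A sketch only, in Python: the helper name, paths, and credentials are placeholders, and the function just builds the argument list without running anything. `-b`, `-user`, and `-password` are gbak's backup and login switches; check the documentation for your version before relying on them.

```python
def gbak_backup_command(database, backup_file, user, password):
    """Build (but do not run) a gbak backup command line.

    database:    connect string of the live database (use the same string
                 every other client uses).
    backup_file: where gbak should write the backup file.
    """
    return [
        "gbak",
        "-b",                # back up the database
        "-user", user,
        "-password", password,
        database,
        backup_file,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = gbak_backup_command(
        r"srv:C:\data\db.gdb", r"D:\backup\db.gbk", "SYSDBA", "secret"
    )
    print(" ".join(cmd))
```

The resulting backup file, not the live .gdb, is what should then be copied to separate media.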

                                  > > You're running on a windows operating system and
                                  > > you're using different connect strings for different clients.

                                  >After discussing more with our support people, I've found that we are
                                  >definitely doing this.

                                  Ouch.


                                  > > The best solution is to convert to Firebird RC2, which will detect
                                  > > that problem and prevent the corruption.

                                  >Solution 1 is something we'll be discussing. However, our concern is
                                  >when Interbase and Firebird diverge enough that components to access
                                  >one won't work to access the other (Specifically Interbase Express).
                                  >If there are going to be IBX components to do the same job, with at
                                  >least the same performace, then we're fine.

                                  Absent some deliberate efforts on the part of Borland and their
                                  allies to sabotage the interface, Firebird and IBX should work
                                  together for the indefinite future. Such an effort is unlikely
                                  because it would also sabotage Borland customers running older
                                  versions.

                                  > > The "end of file" error occurs because the system is looking for a
                                  > > transaction inventory page (80% probability) that did not get
                                  > > written.
                                  >
                                  >We were getting the end of file error, before we started hex editing
                                  >the header of the gdb file.

                                  That's a common symptom for this particular error.

                                  >When we started playing with transaction counters, we were
                                  >decrementing them, not incrementing them, on the assumption that if
                                  >we could get back to the last "good" transaction, we could then use
                                  >the gdb file from there, with the amount of data that could be "seen"
                                  >and whatever was lost was the client's fault, because of their poor
                                  >backup procedures.

                                  That's actually somewhat backward, though incrementing the counters
                                  has its problems. One reasonable thing to do is to set the database
                                  to be read-only, then increment the counters. The problem is not that
                                  there are "bad" transactions that

                                  >My question is, if all of the restore procedures that have been
                                  >outlined previously don't work, is decrementing the transaction
                                  >counters like this, to get to the last "good" transaction in the
                                  >file, a valid approach to take to repair this file and get the
                                  >customer back up and running in the short term (Probably with a
                                  >gbak/grestore after the gdb file can be opened, so we have a clean
                                  >gdb file)? Will there be issues that are not immediately obvious
                                  >(Aside from the loss of data that's almost sure to result)?

                                  Here's what I do. Set the database to read only, using a hex-editor
                                  if necessary. Scan through the data, looking for the highest
                                  recorded transaction id (see ods.h for a description of the format
                                  of a data page and a record). Set the next transaction to be higher
                                  than that, and if necessary create a new TIP page (see ods.h again).
                                  Again, if necessary, set the state of transactions on that page to
                                  committed. Finally, do a backup & restore.
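The "set the state of transactions on that page to committed" step can be illustrated with a toy model of the state bits. This Python sketch works on an in-memory byte buffer standing in for the state area of a TIP. It assumes two bits per transaction, packed four to a byte with the lowest-numbered transaction in the low-order bits, and assumes the state codes shown; all of that must be checked against ods.h and the engine source for your version, and the function names are hypothetical. It does not read or write a real database file.

```python
# Assumed state codes; verify against the engine source before trusting them.
TRA_ACTIVE, TRA_LIMBO, TRA_DEAD, TRA_COMMITTED = 0, 1, 2, 3


def _locate(n):
    """Byte index and bit shift for transaction n within the state area."""
    return n >> 2, (n & 3) << 1


def get_state(bits, n):
    byte, shift = _locate(n)
    return (bits[byte] >> shift) & 0b11


def set_state(bits, n, state):
    byte, shift = _locate(n)
    cleared = bits[byte] & ~(0b11 << shift) & 0xFF  # zero the two target bits
    bits[byte] = cleared | (state << shift)


def mark_committed(bits, first, last):
    """Mark transactions first..last (inclusive) as committed."""
    for n in range(first, last + 1):
        set_state(bits, n, TRA_COMMITTED)


if __name__ == "__main__":
    state_area = bytearray(4)      # room for 16 transactions, all "active"
    mark_committed(state_area, 0, 5)
    print(state_area.hex())
```

Work on a copy of the file, never the only surviving original.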


                                  Regards,

                                  Ann
                                  www.ibphoenix.com
                                  We have answers.