
Re: [midatlanticretro] Re: vintage SRAMs self healing

  • Mike Loewen
    Message 1 of 18 , Nov 13, 2012
      On Tue, 13 Nov 2012, Evan Koblentz wrote:

      >>> The real question is whether these components remain "healed" if you leave them sitting in your parts bin for a few weeks and then test them again. I suspect the depletion region will start to deteriorate over time again.
      >
      > Finally, a valid MARCH science experiment that doesn't involve beer and alien probes.

      http://sturgeon.css.psu.edu/~mloewen/Q7/scifi/AlienProbe/


      Mike Loewen mloewen@...
      Old Technology http://sturgeon.css.psu.edu/~mloewen/Oldtech/
    • s100doctor
      Message 2 of 18 , Nov 13, 2012
        --- In midatlanticretro@yahoogroups.com, "Mike" <mike@...> wrote:
        >
        > Unfortunately I could not identify any issue related to connectivity, despite efforts to do so. A number of the parts that initially failed, never failed again when I restarted the test without touching anything at all.
        >
        > Wish I could discover some kind of software or hardware problem in the test setup, but so far, I haven't found any smoking guns.
        >
        > regards,
        > Mike W.

        Hard to know the cause, if the effect can't be duplicated. A proper kind of test about "heat" versus "corroded pins" would possibly be something like this:

        Take samples of the RAMS in question BEFORE use, and divide them up into a number of piles. One pile is inserted in a test board and tested as previously described.

        Another pile is heat cycled to simulate testing, but not connected up, no DC power no socketing.

        Another pile is put into and out of sockets, but not heated and not powered up.

        Another pile has NOTHING done.

        Do the testing and heating over the piles as described. Then, do a run of memory tests for ALL piles. See where the errors are. Run memory tests a number of times on ALL piles, see if errors go down. Report results.
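A minimal sketch of how the four piles described above could be assigned and scored. The pile names, the random shuffle, the fixed seed and the chip labels are illustrative assumptions, not part of the original proposal; shuffling is just one way to keep vintage or manufacturer from lining up with a single pile.

```python
import random

# Hypothetical labels for the four piles described in the post.
PILES = ["tested", "heat_cycled_only", "socketed_only", "untouched"]


def assign_piles(chip_ids, seed=0):
    """Shuffle the chips, then deal them round-robin into the four piles."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(chip_ids)
    rng.shuffle(shuffled)
    piles = {name: [] for name in PILES}
    for i, chip in enumerate(shuffled):
        piles[PILES[i % len(PILES)]].append(chip)
    return piles


def error_rate_by_pile(piles, failed):
    """Fraction of chips in each pile that appear in the set of failed chip ids."""
    return {
        name: (sum(chip in failed for chip in chips) / len(chips) if chips else 0.0)
        for name, chips in piles.items()
    }


if __name__ == "__main__":
    piles = assign_piles([f"chip{n:02d}" for n in range(40)])
    for name, chips in piles.items():
        print(name, len(chips))
```

Running the final memory test on all piles and passing the set of failing chips to `error_rate_by_pile` gives one number per pile to compare across repeated runs.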

        Frankly, one would have to build a one-RAM memory tester to do this, or have bunches of RAM to test. Since it's unlikely the set of RAMS are of the same vintage and manufacturer and date of manufacture, I doubt one could "control" for these variables. It's something *I* may be able to do if I buy one lot of 2102's I'm being offered, but... I'm not likely to, it's a lot of work.

        The more REASONABLE hypothesis, is that corrosion of sockets and IC pins caused the initial failures; repeated removal/insertion removed the corrosion. OR...the test computer worked "better", the RAM timing shifted, as it warmed up. Mid-1970's computer designs were not often "stable", and just a little difference in capacitance on the lines (address, data, clocks) could affect marginal performance. I assume a 2102 memory card is old and early, as more dense chips were available later.

        There's a general degrading of memory "speed" with age, which I have assumed was due to semiconductor junctions getting "mushy" after three decades. But as I suggested, it could also be due to "mushy" components on the board. If someone designed a RAM tester with variable speed and timing, one could pin down some of these conditions by tweaking them.

        Practical testing I have done, is to run RAM at faster and faster speed until failure - then use it only at slower speeds. Also, run it "hot" (use a hair dryer) and run it "cold" (stick board in freezer). That lets me identify marginal chips.
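The speed and temperature screening described above can be sketched as a small search. Everything specific here is an assumption for illustration: the `passes_at` callback stands in for a real test run, and the speed steps and per-condition limits in the demo are made-up numbers.

```python
def fastest_passing_speed(passes_at, speeds_mhz):
    """Step up through the speeds until the first failure; return the last speed that passed."""
    last_good = None
    for speed in sorted(speeds_mhz):
        if not passes_at(speed):
            break
        last_good = speed
    return last_good


def margin_report(passes_at, speeds_mhz, conditions=("cold", "room", "hot")):
    """Fastest passing speed per condition, plus the slowest of those as the usable speed."""
    per_condition = {
        cond: fastest_passing_speed(lambda s, cond=cond: passes_at(s, cond), speeds_mhz)
        for cond in conditions
    }
    if any(speed is None for speed in per_condition.values()):
        usable = None  # failed even the slowest speed under some condition
    else:
        usable = min(per_condition.values())
    return per_condition, usable


if __name__ == "__main__":
    # Hypothetical chip: the limits below are invented, not measured values.
    limits = {"cold": 4.0, "room": 3.5, "hot": 2.5}
    report, usable = margin_report(
        lambda speed, cond: speed <= limits[cond], [1.0, 2.0, 3.0, 4.0]
    )
    print(report, usable)
```

A chip whose usable speed is well below its room-temperature result is the kind of marginal part this screening is meant to flag.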

        Side notes: Z80's are kinder to RAM than 8080's at a given clock speed. But Z80 instruction fetches are MORE demanding than memory reads/writes. Consequently, you can "ram test" a board with reads and writes and it will pass, but if you EXECUTE on that board it will FAIL. A classic Z80 memory test was called "worm". It moved a block of code up in memory, ran the code, which if successful printed a memory address to the console, then repeated the move/run/print. It "failed" when the console wasn't updated anymore, crashing into ROM or the top of RAM or a bad location. The console showed the last good running address.

        That's how it was done, in the era.
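For readers who have not seen a "worm" test, here is a rough simulation of the move/run/print loop in Python. It is not the period Z80 code. The `FlakyRam` class, its stuck-bit fault model and the 16-byte block are assumptions made for illustration, and a read-back comparison stands in for actually executing the block, so the instruction-fetch timing that makes the real test demanding is not modeled.

```python
class FlakyRam:
    """Simulated RAM. Addresses in `bad` have bit 0 stuck at 0 (hypothetical fault model)."""

    def __init__(self, size, bad=()):
        self.cells = bytearray(size)
        self.bad = set(bad)

    def __len__(self):
        return len(self.cells)

    def write(self, addr, value):
        self.cells[addr] = (value & 0xFE) if addr in self.bad else value

    def read(self, addr):
        return self.cells[addr]


def worm_test(ram, block):
    """Copy `block` upward one block-length at a time; return the last address where it survived."""
    size = len(block)

    def intact(addr):
        # Stand-in for "the code ran": every byte of the copy must read back correctly.
        return all(ram.read(addr + i) == block[i] for i in range(size))

    for i, byte in enumerate(block):
        ram.write(i, byte)
    if not intact(0):
        return None  # the worm never got started
    here = 0
    while here + 2 * size <= len(ram):  # stop before running off the top of RAM
        nxt = here + size
        for i in range(size):
            ram.write(nxt + i, ram.read(here + i))
        if not intact(nxt):
            break  # a real worm would crash here and the console would go quiet
        here = nxt
        print(f"worm running at {here:#06x}")
    return here


if __name__ == "__main__":
    worm_test(FlakyRam(256, bad={100}), bytes([0xA5] * 16))
```

As in the original, the last address printed is the last place the block was known to be good.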

        Herb Johnson
      • Mike
        Message 3 of 18 , Nov 13, 2012
          Hi Dan,

          I'm suspicious of some sort of physical anomaly, like what you are describing. I would have thought that it would be discovered before, and described in some kind of literature, somewhere.

          I've separated a batch that initially failed and will retest again in a couple of weeks.

          regards,
          Mike Willegal

          --- In midatlanticretro@yahoogroups.com, Dan Roganti <ragooman@...> wrote:
          >
          >
          > This is a very interesting anomaly. I personally can't recall this
          > happening before as I haven't tried this method. I would throw out the IC
          > in question if it failed a memory test.
          > However, this part is one of the first times they used depletion-mode N
          > channel Mosfets on the die, versus enhancement mode type. The geometry
          > of the mosfet incorporates a depletion mode region which isolates the
          > source, drain and gate regions - this is besides the gate insulation.
          > This is the main distinction over the original mosfet type making its
          > performance comparable to the bipolar Rams of that day.
          > This is a highly capacitive layer within the mosfet. I wouldn't doubt a
          > 40yr old component with this type of process would suffer from some kind
          > of leakage. I have a strong suspicion that you are
          > experiencing the mosfet becoming rejuvenated, to a certain degree, by
          > running the test repeatedly.
          >
          > The real question is whether these components remain "healed" if you leave
          > them sitting in your parts bin for a few weeks and then test them again. I
          > suspect the depletion region will start to deteriorate over time again.
          >
          > Dan
          >
        • Mike
          Message 4 of 18 , Nov 13, 2012
            Hi Herb,

            These are individually tested (except for speed) in a purpose built fixture.
            http://willegal.net/superproto/index.php?title=2102_SRAM_Tester

            Looking closely at the parts, there is some corrosion on pins, maybe that has something to do with it, though it is not clear why they start passing, even without touching them.

            regards,
            Mike W.

            --- In midatlanticretro@yahoogroups.com, "s100doctor" <hjohnson@...> wrote:
            >
            >
            >
            > --- In midatlanticretro@yahoogroups.com, "Mike" <mike@> wrote:
            > >
            > > Unfortunately I could not identify any issue related to connectivity, despite efforts to do so. A number of the parts that initially failed, never failed again when I restarted the test without touching anything at all.
            > >
            > > Wish I could discover some kind of software or hardware problem in the test setup, but so far, I haven't found any smoking guns.
            > >
            > > regards,
            > > Mike W.
            >
            > Hard to know the cause, if the effect can't be duplicated. A proper kind of test about "heat" versus "corroded pins" would possibly be something like this:
            >
            > Take samples of the RAMS in question BEFORE use, and divide them up into a number of piles. One pile is inserted in a test board and tested as previously described.
            >
            > Another pile is heat cycled to simulate testing, but not connected up, no DC power no socketing.
            >
            > Another pile is put into and out of sockets, but not heated and not powered up.
            >
            > Another pile has NOTHING done.
            >
            > Do the testing and heating over the piles as described. Then, do a run of memory tests for ALL piles. See where the errors are. Run memory tests a number of times on ALL piles, see if errors go down. Report results.
            >
            > Frankly, one would have to build a one-RAM memory tester to do this, or have bunches of RAM to test. Since it's unlikely the set of RAMS are of the same vintage and manufacturer and date of manufacture, I doubt one could "control" for these variables. It's something *I* may be able to do if I buy one lot of 2102's I"m being offered, but....I'm not likely to, it's a lot of work.
            >
            > The more REASONABLE hypothesis, is that corrosion of sockets and IC pins caused the initial failures; repeated removal/insertion removed the corrosion. OR...the test computer worked "better", the RAM timing shifted, as it warmed up. Mid-1970's computer designs were not often "stable", and just a little difference in capacitance on the lines (address, data, clocks) could affect marginal performance. I assume a 2102 memory card is old and early, as more dense chips were available later.
            >
            > There's a general degrading of memory "speed" with age, which I have assumed was due to semiconductor junctions getting "mushy" after three decades. But as I suggested, it could also be due to "mushy" components on the board. If someone designed a RAM tester with variable speed and timing, one could pin down some of these conditions by tweaking them.
            >
            > Practical testing I have done, is to run RAM at faster and faster speed until failure - then use it only at slower speeds. Also, run it "hot" (use a hair dryer) and run it "cold" (stick board in freezer). That lets me identify marginal chips.
            >
            > Side notes: Z80's are kinder to RAM than 8080's at a given clock speed. But Z80 instruction fetches are MORE demanding than memory reads/writes. Consequently, you can "ram test" a board with reads and writes and it will pass, but if you EXECUTE on that board it will FAIL. A classic Z80 memory test was called "worm". It moved a block of code up in memory, ran the code which if successful printed a memory address to console, then repeated the move/run/print. It "failed" when the console wasn't updated anymore, crashing into ROM or the top of RAM or a bad location. the console showed the last good running address.
            >
            > That's how it was done, in the era.
            >
            > Herb Johnson
            >
          • Dan Roganti
            Message 5 of 18 , Nov 13, 2012


              On Tue, Nov 13, 2012 at 7:10 PM, Mike <mike@...> wrote:
              > Hi Dan,
              >
              > I'm suspicious of some sort of physical anomaly, like what you are describing. I would have thought that it would be discovered before, and described in some kind of literature, somewhere.
              >
              > I've separated a batch that initially failed and will retest again in a couple of weeks.

              I suggest separating this "Initial Fail" batch into groups, for which you can organize a testing profile. I'm not sure how many you have there to use as test samples. Each group is tested after an incremental time period, say at weekly intervals. One group can be tested 1 week later, the next group 2 weeks later, and so on up to a month or two. You want to avoid influencing the state of the mosfet cells, so by testing a fresh sample from the lot each time, you can observe whether there is actually any deterioration.
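One way to lay out that staggered retest plan, as a rough sketch. The function name, the group count, the chip labels and the use of the thread's first posting date as a starting point are all assumptions for illustration.

```python
from datetime import date, timedelta


def staggered_schedule(chip_ids, first_test, n_groups, interval_days=7):
    """Split the chips into n_groups; group k is retested exactly once, (k + 1) intervals later."""
    chips = list(chip_ids)
    schedule = []
    for k in range(n_groups):
        group = chips[k::n_groups]  # every n_groups-th chip, so the groups are disjoint
        due = first_test + timedelta(days=interval_days * (k + 1))
        schedule.append((due, group))
    return schedule


if __name__ == "__main__":
    # Hypothetical batch of 12 initially failing chips, four weekly groups.
    for due, group in staggered_schedule(range(12), date(2012, 11, 13), 4):
        print(due.isoformat(), group)
```

Because each chip appears in only one group, no chip is exercised between its initial test and its scheduled retest, which is the point of the scheme.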

              Dan

            • hornbetw
              Message 6 of 18 , Nov 13, 2012
                Mike,

                You may be running into capacitance issues using the plugin breadboard and ribbon cable. The plugin boards have fairly high capacitance between rows of connections.

                Tom
              • corey986
                Message 7 of 18 , Nov 14, 2012
                  Tom,

                  I have the same test setup as Mike. I'm not sure it's capacitance causing the "failures", for all we know it's helping revive the chips ;)

                  I have tested known bad chips and had this happen. Some seem to "walk" the failure point: each time the test is restarted it fails at a different address, until eventually it stops failing and the chip tests good. Not all my bad chips, but enough to be suspicious. I also know it's not dirty leads on the chips, because the first thing I did with the bad memory card after I started having problems was remove the chips, soak them in ISP and brush the leads with a toothbrush. That fixed only a single chip out of a lot of 64 on a card with at least 18 known bad chips...

                  So I pulled out the 2102 RAM tester setup and started testing the RAM that was in the card individually. That is when I saw this weird effect. I know the RAM tester worked, because I tested a bunch of NOS RAM before I installed it in another card and then ran a system-level RAM test and all was good. That was the same in-system RAM test that failed on the card that prompted all my recent testing.

                  I'm really just wondering if no one has seen this before because when they had a bad chip, they threw it out.

                  Cheers,
                  Corey


                  --- In midatlanticretro@yahoogroups.com, "hornbetw" <hornbetw@...> wrote:
                  >
                  > Mike,
                  >
                  > You may be running into capacitance issues using the plugin breadboard and ribbon cable. The plugin boards have fairly high capacitance between rows of connections.
                  >
                  > Tom
                  >
                • s100doctor
                  Message 8 of 18 , Nov 14, 2012
                    --- In midatlanticretro@yahoogroups.com, corey986 <no_reply@...> wrote:
                    >
                    >
                    > Tom,
                    >
                    > I have the same test setup as Mike. I'm not sure it's capacitance causing the "failures", for all we know it's helping revive the chips ;)
                    >
                    > I have tested known bad chips and had this happen. Some seem to "walk" the memory test failure point from a restart of the test failing at different addresses each time, eventually not failing afterwards and being good. Not all my bad chips but enough to be suspicious. I also know it's not dirty leads on the chips, because first thing I did with the bad memory card after I started having problems was remove the chips and soak them in ISP and brush the leads with a tooth brush. Only fixed a single chip out of a lot of 64 on a card with at least 18 known bad chips...
                    >
                    > So I pulled out the 2102 ram tester setup and started testing the ram that was in the card individually. That is when I saw this weird effect. I know the ram tester worked, because I tested a bunch of NOS ram before I installed it in another card and then ran a system level RAM test and all was good. The same in system RAM test that failed on the card that caused all my recent testing.
                    >
                    > I'm really just wondering if no one has seen this before because if they had a bad chip, they throw them out.
                    >
                    > Cheers,
                    > Corey---
                    >
                    I had not realized, you (both) were using a custom one-chip tester.

                    Also - what's "ISP"?

                    I'm impressed with the effort you made, to program up a 6522 to test static RAMS. Both an 1101 and now a 2102. Nice go/no-go testing. But what I see in the tester (for those who didn't read the linked page, even I haven't run through the software in detail), is that you have a parallel I/O chip being driven by software, which drives one RAM chip; and you have the chip in test at the other end of several inches of flatcable, and in a protostrip.

                    What that tells me - and this is not a complaint, it's a description - is that the RAM chip is being run VERY slowly, well below spec; and there's relatively plenty of capacitance on every pin; and possibly a certain amount of crosstalk between those lines. An actual read/write cycle time could be measured with an oscilloscope; that could also show any crosstalk (incorrect signals or timing). I'd be curious to know the "speed".

                    So I'd call it a "go/no-go" tester because it doesn't test for speed, only for function. I'd imagine that any chip which failed that tester, would have to be near dead. But any chip that succeeds, may yet fail if run on a memory board at CPU speed.

                    Another problem with comparing results under "use", is that some S-100 RAM and CPU cards were not terribly well-designed, or are set up for wait states. So chips that fail ONE board or CPU, may run on another. Even putting a terminator on the S-100 bus can make a difference. I already said Z80's and 8080's run RAM differently.

                    Thanks for calling out evidence that it's not "corrosion". If cleaned chips don't run better than "dirty" chips, to a few percent gain, that seems to remove that consideration. There's all kinds of corrosion issues in 1970's technology.

                    Again - that's a nice little tester. Go/no-go rapid testing of individual RAM chips is very useful, and can be adapted for different chips as needed. I'd like to point to your tester from my Web pages on repair/restoration.

                    Herb Johnson
                  • Dan Roganti
                    Message 9 of 18 , Nov 14, 2012

                      Mike,

                      I didn't see your link before and just took a look at your homebrew tester. It really needs some more work to tighten up the circuit before proceeding with further experiments, so that you will have a good baseline to work with. I don't think it's a complete kludge :) but you'd really like to have a solid working platform. Yours can be done with some minimal steps - it's the same wiring but now you have to solder it :) You can dedicate a type of daughter card which would hold the RAM test socket -- or dedicate the whole SuperProto card as a tester. The daughter card would plug into the top (or rear) of your SuperProto card. You can wire a generic I/O interface on the SuperProto card from the 6522. Use some right-angle headers/connectors to mate the two on top of the SuperProto card. Have one daughter card dedicated for the 2102 and another for the 1101.

                      The important issue is to keep the wiring short, point to point, and use plenty of ground pins on the headers/connectors - the typical layout is to alternate the signals with grounds along the header pins to reduce crosstalk - and having plenty of ground pins helps to avoid ground bounce when many signals are switching simultaneously on the bus. Since there's a minimal amount of passive components on there, you can improve the quality even further prior to adding the RAM tester circuit. Create a ground plane using the grid of plated through-holes - say a grid with a gap of 300 mils - and connect each side of the grid straight to the ground bus. The more PCB area that's covered with a ground plane, the better the signal integrity.

                      BTW, the capacitance of a ribbon cable is not significantly higher than that of an FR4 PCB: approx. 2 pF/inch vs 1 pF/inch. Ribbon cables have been used everywhere for the past 40 years on boards fatter than my butt. It's all in how you organize the signals. This is where you reduce the crosstalk/signal integrity issues.
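To put those per-inch figures in context, a back-of-the-envelope calculation may help. The 2 pF/inch and 1 pF/inch values are the ones quoted above; the 6-inch run and the 100-ohm driver impedance are assumptions picked only to make the arithmetic concrete, not measurements of the actual tester.

```python
def line_capacitance_pf(pf_per_inch, length_in):
    """Total capacitance of a run, given a per-inch figure."""
    return pf_per_inch * length_in


def rc_time_constant_ns(driver_ohms, capacitance_pf):
    """Simple RC time constant. Ohms times picofarads gives picoseconds; divide by 1000 for ns."""
    return driver_ohms * capacitance_pf / 1000.0


if __name__ == "__main__":
    LENGTH_IN = 6.0        # assumed cable length ("several inches")
    DRIVER_OHMS = 100.0    # assumed driver output impedance
    for name, pf_per_inch in (("ribbon cable", 2.0), ("FR4 trace", 1.0)):
        c = line_capacitance_pf(pf_per_inch, LENGTH_IN)
        tau = rc_time_constant_ns(DRIVER_OHMS, c)
        print(f"{name}: {c:.0f} pF, RC about {tau:.1f} ns")
```

Under these assumptions the difference between cable and board is a few picofarads and around a nanosecond of RC delay, which is small next to a software-driven access measured in microseconds.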
                      Dan

                    • s100doctor
                      Message 10 of 18 , Nov 15, 2012
                        --- In midatlanticretro@yahoogroups.com, Dan Roganti <ragooman@...> wrote:
                        >
                        > Mike,
                        >
                        > I didn't see your link before and just took a look at your homebrew tester.
                        > It really needs some more work to tighten up the circuit before proceeding
                        > with further experiments.
                        > The important issue is to keep the wiring short, point to point...
                        > Create a ground plane on there using the grid of plated through-holes...

                        Dan, I don't know if you read my most-recent post, before writing yours. Even though I also mentioned crosstalk and noise and such - frankly, I don't think that matters, with this tester. He's using a bunch of latches under software control, to operate the RAM. It's going to run slowly enough, that any switching-around noise won't last long enough to matter - much. (I suppose there could be some situation where it does, if one worked at it.)

                        Seems to me...if he's to go to the trouble of building a proper, ground-planed, short-wire fixture....he may as well design something that connects the RAM right to the processor, and run it at CPU clock speeds. That is, under design conditions.

                        THEN all that stuff you describe, matters. And THEN, with such a real-use kind of tester, with "good" signals all around, then one can do some more serious testing, and draw more serious conclusions.

                        Again - it's a reasonable bench-top, put-together-for-use, kind of tester. But to use it to make claims about "self-healing RAM" phenomena, I think is pushing its limitations.

                        Herb Johnson
                      • Mike
                        Message 11 of 18 , Nov 15, 2012
                          Hi,

                          Boy, you guys are tough reviewers. :-)

                          Other than testing speed, this test is about as complete as it can get. Signals look remarkably good, edges are fine with no ringing, overshoot or undershoot.

                          The nice thing about this tester is how quickly I was able to put it together. 3 evenings for the 1101 tester and a couple of more hours to create the 2102 version. Directly interfacing to a processor would have taken considerably longer.

                          The other interesting thing is that I tested 200 1101 parts, and the only failure was one part that I damaged during development of the test.

                          I'm currently investigating whether tarnish is a contributing factor.

                          Regards,
                          Mike W.

                          --- In midatlanticretro@yahoogroups.com, "s100doctor" <hjohnson@...> wrote:
                          >
                          >
                          >
                          > --- In midatlanticretro@yahoogroups.com, Dan Roganti <ragooman@> wrote:
                          > >
                          > > Mike,
                          > >
                          > > I didn't see your link before and just took a look at your homebrew tester.
                          > > It really needs some more work to tighten up the circuit before proceeding
                          > > with further experiments.
                          > > The important issue is to keep the wiring short, point to point...
                          > Create a
                          > > ground plane on there using the grid of plated through-holes...
                          >
                          > Dan, I don't know if you read my most-recent post, before writing yours. Even though I also mentioned crosstalk and noise and such - frankly, I don't think that matters, with this tester. He's using a bunch of latches under software control, to operate the RAM. It's going to run slowly enough, that any switching-around noise won't last long enough to matter - much. (I suppose there could be some situation where it does, if one worked at it.)
                          >
                          > Seems to me...if he's to go to the trouble of building a proper, ground-planed, short-wire fixture....he may as well design something that connects the RAM right to the processor, and run it at CPU clock speeds. That is, under design conditions.
                          >
                          > THEN all that stuff you describe, matters. And THEN, with such a real-use kind of tester, with "good" signals all around, then one can do some more serious testing, and draw more serious conclusions.
                          >
                          > Again - it's a reasonable bench-top, put-together-for-use, kind of tester. But to use it to make claims about "self-healing RAM" phenomena, I think is pushing its limitations.
                          >
                          > Herb Johnson
                          >
                        • s100doctor
                          Message 12 of 18 , Nov 16, 2012
                            --- In midatlanticretro@yahoogroups.com, "Mike" <mike@...> wrote:
                            >
                            >
                            > HI,
                            >
                            > Boy, you guys are tough reviewers. :-)
                            >
                            > Other than testing speed, this test is about as complete as it can get. Signals look remarkably good, edges are fine with no ringing, overshoot or undershoot.
                            >
                            > The nice thing about this tester is how quickly I was able to put it together. 3 evenings for the 1101 tester and a couple of more hours to create the 2102 version. Directly interfacing to a processor would have taken considerably longer.
                            >
                            > The other interesting thing is that I tested 200 1101 parts, with only 1 part that I damaged during development of the test, failing.
                            >
                            > I'm currently investigating whether tarnish is a contributing factor.
                            >
                            > Regards,
                            > MIke W.

                            I largely agree with you - this is a nice bit of work. One could test most any bit of logic with variations of this Apple II code and hardware. And it's not hard to do the same logic, with a Z80, a 6800, etc. It might be fun to use a microKIM, could even use most of your code! Plus, the microKIM could probably implement an at-speed tester too, plenty of address space "open". Even I, "the S-100 guy", have one of those.

                            I'm curious... I think, reading your read/write subroutine, you are probably running the RAM at, say, 20-25 microseconds per access, given an Apple II at 1 MHz? If you looked at the signals, you probably know how fast your scope was sweeping to see one access time.
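The arithmetic behind that estimate is easy to write down. In this sketch the 20 to 25 cycles per access is simply the quoted guess restated in clock cycles at a nominal 1 MHz, and the assumption of one write plus one read per cell is illustrative; only the 1024 x 1 organization of the 2102 is a property of the part itself.

```python
def access_time_us(cycles_per_access, clock_hz):
    """Time for one software-driven access, in microseconds."""
    return cycles_per_access * 1e6 / clock_hz


def full_pass_ms(n_cells, accesses_per_cell, access_us):
    """Time to touch every cell the given number of times, in milliseconds."""
    return n_cells * accesses_per_cell * access_us / 1000.0


if __name__ == "__main__":
    CLOCK_HZ = 1_000_000   # nominal 1 MHz, as in the estimate above
    CELLS_2102 = 1024      # the 2102 is organized as 1024 x 1 bit
    for cycles in (20, 25):
        t = access_time_us(cycles, CLOCK_HZ)
        # Assumed pattern: one write and one read per cell.
        print(f"{cycles} cycles: {t:.0f} us/access, "
              f"{full_pass_ms(CELLS_2102, 2, t):.1f} ms per pass")
```

An oscilloscope measurement of one read/write cycle, as suggested above, would replace the guessed cycle count with a real figure.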

                            I'm inspired to make something like this, if I get a Z80 prototype running next year. Thanks for keeping the 1970's on the bleeding edge again. Well....maybe the leaking edge.....;)

                            Herb Johnson