Re: [trimedia] about prefetch and memory copy

  • Chuck Peplinski
    Message 1 of 2 , Dec 31, 2004
      I know that a version of memcpy optimized for the 1500 is in the
      pipeline, but I do not have it yet. In general, copying a frame will
      never be efficient. It is best to reorganize your code so that frame
      copying is not required. For instance, if the frame is needed for
      reference, keep a pointer to it and do not reuse the memory until both
      users (display and reference) are done. Or, if the encoder reference
      code uses a different memory format, change the encoder to use the
      TM-optimal format.
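
      The "keep a pointer until both users are done" idea can be sketched
      roughly as below. This is only an illustration: frame_t, frame_share
      and frame_release are hypothetical names, not part of any TriMedia
      library, and the counter is not protected against concurrent access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame descriptor: one buffer shared by several users. */
typedef struct {
    unsigned char *data;   /* pixel storage, owned by whoever allocates frames */
    int users;             /* how many consumers still need this frame */
} frame_t;

/* Hand the frame to `count` consumers (e.g. 2: display and reference). */
static void frame_share(frame_t *f, int count)
{
    assert(f != NULL && count > 0);
    f->users = count;
}

/* A consumer calls this when it is done with the frame.
   Returns 1 once the last user has released it, i.e. the memory may be reused. */
static int frame_release(frame_t *f)
{
    assert(f != NULL && f->users > 0);
    f->users--;
    return f->users == 0;
}
```

      With two users, the first frame_release() returns 0 and the second
      returns 1, so the buffer goes back to the free pool only then. If the
      display and the encoder run in different tasks, the decrement would
      need whatever locking your RTOS provides.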

      Here is some general material about cache operations:
      - If you are using the 1300, forget prefetch. Due to various problems
      (hardware and SW support), most people just wrote it off.
      - If you are using the 1500, it may help you. I do not have a source
      code example on hand.

      The following two cache control custom ops can be useful:

      #include <custom_defs.h>
      ALLOCATE(a, n)
      Tells the cache that you are about to overwrite the n cache line(s)
      at address a, so the cache hardware can skip reading them from memory.
      PREFETCH(a, n)
      Tells the cache to bring the n cache lines at address a into the cache.

      Considerations:
      - The TCS 4.4 compiler does respect the ordering of these ops in C
      code. Older compilers did not.
      - Look at the assembly and check the prefetch counters to be sure the
      ops are not discarded.
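
      As a rough illustration of where these two ops might go in a copy
      loop, here is a sketch. Several things in it are assumptions and not
      from the toolchain: USE_TM_CUSTOM_OPS is a hypothetical build flag,
      the host stand-in macros only count calls so the code compiles
      off-target, LINE_BYTES is an assumed line size, and both buffers are
      assumed to be cache-line aligned and a whole number of lines long.

```c
#include <stddef.h>

#ifdef USE_TM_CUSTOM_OPS          /* hypothetical flag: define it for TCS builds */
#include <custom_defs.h>
#else
/* Host stand-ins so the sketch compiles off-target; they only count calls. */
static unsigned long prefetch_calls, allocate_calls;
#define PREFETCH(a, n) ((void)(a), (void)(n), prefetch_calls++)
#define ALLOCATE(a, n) ((void)(a), (void)(n), allocate_calls++)
#endif

#define LINE_BYTES 128            /* assumed cache line size; check your chip */

/* Copy `lines` whole cache lines from src to dst. */
static void copy_lines(unsigned char *dst, const unsigned char *src, size_t lines)
{
    size_t i, j;
    for (i = 0; i < lines; i++) {
        if (i + 1 < lines)
            PREFETCH(src + (i + 1) * LINE_BYTES, 1);  /* ask for the next source line */
        ALLOCATE(dst + i * LINE_BYTES, 1);  /* this whole line is about to be overwritten */
        for (j = 0; j < LINE_BYTES; j++)
            dst[i * LINE_BYTES + j] = src[i * LINE_BYTES + j];
    }
}
```

      Note that ALLOCATE is only safe here because every byte of the
      destination line is written. Whether a loop like this actually helps
      is something only the counters can tell you.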

      Understand that there are conditions under which the prefetch
      instruction will be discarded (not executed). This is usually because
      the resources needed for prefetch are already engaged by another cache
      operation (load). To understand whether this is happening, you will
      have to look at the assembly code and set up the counters to monitor
      prefetch operations. This is tedious stuff, much akin to assembly
      programming.

      The cache counters are described in the 3260 architecture book. Search
      for the "MEM_EVENTS" MMIO register. This selects one of 15 items to
      count. Two counters are typically used to monitor these events, and
      you must select which events they count. I have known programmers to
      run the same data 8 times so as to collect the counts for all items.
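
      The "run the same data 8 times" approach can be sketched as a small
      driver loop. Everything hardware-facing here is hypothetical: the
      select/read hooks stand in for whatever code programs MEM_EVENTS and
      reads the counters (the real register layout is in the architecture
      book), and the fake_* functions exist only so the loop can be tried
      on a host machine.

```c
#include <stddef.h>

#define NUM_COUNTERS 2   /* two counters monitor the selected events */

/* Hypothetical hooks; on target they would touch the MMIO registers. */
typedef void (*select_fn)(int counter, unsigned event);
typedef unsigned long (*read_fn)(int counter);
typedef void (*workload_fn)(void);

/* Run `work` repeatedly on the same data, NUM_COUNTERS events per pass.
   results[i] receives the count read for events[i].
   Returns the number of passes that were needed. */
static size_t profile_events(const unsigned *events, size_t n_events,
                             unsigned long *results,
                             select_fn select_event, read_fn read_counter,
                             workload_fn work)
{
    size_t passes = 0, base, c;
    for (base = 0; base < n_events; base += NUM_COUNTERS) {
        for (c = 0; c < NUM_COUNTERS && base + c < n_events; c++)
            select_event((int)c, events[base + c]);
        work();
        for (c = 0; c < NUM_COUNTERS && base + c < n_events; c++)
            results[base + c] = read_counter((int)c);
        passes++;
    }
    return passes;
}

/* Host stand-ins for trying the loop off-target. */
static unsigned fake_selected[NUM_COUNTERS];
static void fake_select(int counter, unsigned event) { fake_selected[counter] = event; }
static unsigned long fake_read(int counter) { return 100UL + fake_selected[counter]; }
static void fake_work(void) { }
```

      With 15 items and two counters this loop makes 8 passes. Resetting
      the counters before each pass is left to the select hook, and the
      results are only comparable if every pass really does process
      identical data.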

      Some other things to know about prefetch:
      - The PREFETCH instruction asks the cache to fetch data from memory
      into the data cache without waiting.
      - Requires the copy back buffer for 5 cycles to determine LRU (i.e.,
      a prefetch could be discarded because a copyback is in progress).
      - Requires the refill buffer to do the actual data fetch (i.e., a
      prefetch could be discarded because another cache fetch is in progress).
      - Will be discarded if the buffers are not free.
      - Only one prefetch can be scheduled at a time.
      - The fetched data replaces the least recently used (LRU) entry in the
      cache.
      - Additional stalls can be avoided if the next 5 instructions do not
      use the data path.

      And finally, more details about what is actually counted by the prefetch
      counters:
      - 1011: Prefetch operation.
        Counts one when a prefetch operation is requested.
      - 1100: Prefetch operation discarded (because of cache hit or no
        resources).
        Counts one when a prefetch operation is discarded for any reason.
      - 1101: Prefetch operation discarded (because of cache hit).
        Counts one when a prefetch is discarded because of a cache hit.
      - 1110: Instruction cache prefetch.
      - 1111: Data cache prefetch.
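
      Once you have the counts for events 1011, 1100 and 1101, a little
      arithmetic tells you how many prefetches were lost to busy buffers.
      The sketch below is my reading of the event descriptions above, not
      something taken from the architecture book: since 1100 counts all
      discards and 1101 counts discards due to hits, the difference is taken
      as the "no resources" discards. The struct and function names are made
      up for the example.

```c
/* Counts collected for the three prefetch events. */
typedef struct {
    unsigned long requested;       /* event 1011 */
    unsigned long discarded;       /* event 1100: cache hit or no resources */
    unsigned long discarded_hit;   /* event 1101: cache hit only */
} prefetch_counts_t;

/* Subtract without wrapping, in case counts from separate runs disagree. */
static unsigned long sub_floor0(unsigned long a, unsigned long b)
{
    return a > b ? a - b : 0UL;
}

/* Prefetches dropped because the needed buffers were busy. */
static unsigned long discarded_no_resources(const prefetch_counts_t *c)
{
    return sub_floor0(c->discarded, c->discarded_hit);
}

/* Prefetches that were not discarded. */
static unsigned long prefetches_kept(const prefetch_counts_t *c)
{
    return sub_floor0(c->requested, c->discarded);
}
```

      For example, 1000 requested, 300 discarded and 120 discarded on a hit
      would mean 180 prefetches were lost to busy buffers and 700 went
      through. A large "no resources" number suggests the prefetches are
      placed too close to other loads or copybacks.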

      Hope this helps.

      Chuck

      Chuck Peplinski
      TriMedian at MDS www.mds.com




      dahuamymaymy wrote:

      >
      > How do I use prefetch ops such as prefr?
      > I use them in my code, but then it takes more time to run my encoder.
      > Could anyone give me an example to show me where I can use them?
      >
      > Also, in order to copy an image to a new one, I use memcpy() to copy
      > a line at a time. It takes 2.5s to copy a 640x480 image.
      > How can I reduce this time?
      >

