8644 Views 11 Replies Latest reply: Oct 5, 2010 11:54 PM by Pierre Ducrot
Pierre Ducrot Grand Master 296 posts since
Nov 4, 2007

Sep 21, 2010 9:31 AM

Generate New Data Record (Java)

Hi,

 

I designed a component that collects some information from the records passed to it, then runs a complex algorithm that eventually generates new data records. One issue with this is that you have to collect the new nodes during the whole process, then release them at the component's onProcess level (see below, although I assume that everyone knows this).

 

The main problem with this approach is that when many thousands of nodes have to be collected, you rapidly get a "Java Heap Space" error message.

 

public Component.State onProcess(Context context, DataRecord data) throws Exception {
     if (data.isNewData()) {
          ...
          data.setRoot(node);
          data.routeTo(Route.PassPort);

          if (nodeIterator.hasNext()) {
               return Component.State.ReadyForNewData;
          }
          else {
               return Component.State.DoneProcessingData;
          }
     }
     else {
          // algorithm here: create nodes and add to the node collection
          nodeIterator = nodeCollection.iterator();
          if (nodeIterator.hasNext()) {
               return Component.State.ReadyForNewData;
          }
          else {
               return Component.State.DoneProcessingData;
          }
     }
}

So, the question is: is there another way to generate new data records which would not loop on "onProcess" and let me generate and send out nodes (data records) directly from the algorithm?
Hope I'm clear enough,
Cheers,
Pierre

  • cagramont Accelrys 22 posts since
    Oct 12, 2009
    Sep 22, 2010 3:25 AM (in response to Pierre Ducrot)
    Re: Generate New Data Record (Java)

    Could you provide us with a bit more information so we can help address this?  Are you keeping all of your nodes in some sort of container on the Java side versus keeping a reference to the top-level node?  Also, what method are you using to create nodes in Java?  It's not clear what you mean by "node collection".

  • chris.farmer Grand Master 149 posts since
    Oct 29, 2007
    Sep 22, 2010 4:52 AM (in response to Pierre Ducrot)
    Re: Generate New Data Record (Java)

    Hi Pierre,

     

    It's sadly been a long time since I've done something this fun, but why do you have to create all those nodes up front?  Couldn't you just cache the data directly (or keep a pointer to some location in an existing store of queued-up data, such as a file or database) and then populate a node on demand when the new data record is served up to you in your onProcess call?  I have no idea what your custom logic entails, but it seems like you'd have better luck and more flexibility if you kept your data outside of a PP node until the moment the node has to be pushed out into the stream.
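
    To make the "populate on demand" idea concrete, here is a minimal sketch in plain Java. It deliberately uses no Pipeline Pilot SDK types: RecordSource, drain and the int[] payload are all hypothetical stand-ins, and the loop in drain only imitates the repeated onProcess calls. The point is just that the algorithm hands out one result at a time, so only one record's worth of data is alive at any moment.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Sketch only: none of these types come from the Pipeline Pilot SDK.
 * RecordSource stands in for "the algorithm", and each int[] stands in for
 * the lightweight data from which a node would be populated on demand.
 */
public class LazyRecords {

    /** Produces results one at a time instead of building a collection up front. */
    static final class RecordSource implements Iterator<int[]> {
        private final int total;
        private int next = 0;

        RecordSource(int total) {
            this.total = total;
        }

        @Override
        public boolean hasNext() {
            return next < total;
        }

        @Override
        public int[] next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            int i = next++;
            // Placeholder for one step of the real algorithm.
            return new int[] { i, i * i };
        }
    }

    /** Imitates the repeated onProcess calls: one record is built, emitted, then dropped. */
    static long drain(Iterator<int[]> source) {
        long emitted = 0;
        while (source.hasNext()) {
            int[] payload = source.next();
            // In a real component, this is where a single node would be
            // populated from payload and routed out.
            if (payload.length == 2) {
                emitted++;
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(drain(new RecordSource(1_000_000)));
    }
}
```

    Whether your algorithm can be restructured to yield results incrementally like this depends on its logic, of course; if it can't, spooling the intermediate data to a file or database is the fallback.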

     

     

     

    Chris

  • pcochrane Accelrys 7 posts since
    Jun 18, 2009
    Sep 22, 2010 5:32 AM (in response to Pierre Ducrot)
    Re: Generate New Data Record (Java)

    Hi Pierre,

     

    I'm interested to know how you're caching the nodes, as they're not meant to be cached in any way other than through the NodeCache object. Since you call an iterator() method on your "node collection" and NodeCache doesn't have one, you're clearly not using it. You can get the NodeCache from the Context object and keep it as a member variable of your component class:

     

    dataRecordCache = context.openNodeCache();

     

    You can then add nodes to it:

     

    dataRecordCache.write(data.findRoot());

     

    Where data is your DataRecord object (above is just adding the root node to the cache).

     

    You get them back by index:

     

    dataRecordCache.read(0);


    I'm not sure how this object is implemented, so I don't know whether it's any more memory-efficient than the method you're currently using. This might not solve anything.

     

    If it doesn't, I would say that Chris's suggestion is a good one: write out the data you need elsewhere, then read it back as you need it. The only alternative is to alter the amount of memory the PP JVM is allowed to use. I'm not sure if this is something we recommend or not, to be honest. The maximum limit by default is 256M. With that in mind, if you're still running out of memory then perhaps you do need to cache the data to file rather than attempt to keep it in memory. Upping the memory available will always be limiting, I guess.
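
    As a rough illustration of "write it out, then read it back as you need it", here is a sketch that spools records to a temporary file using only the standard Java library. DiskSpool and roundTrip are hypothetical names, and records are assumed to be single-line strings; real node data would need its own serialization.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: a write-then-read spool backed by a temp file (not an SDK class). */
public class DiskSpool implements AutoCloseable {
    private final Path file;
    private BufferedWriter writer;
    private BufferedReader reader;

    public DiskSpool() throws IOException {
        file = Files.createTempFile("spool", ".txt");
        writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
    }

    /** Appends one record; assumes the record contains no line breaks. */
    public void write(String record) throws IOException {
        if (writer == null) {
            throw new IllegalStateException("spool is already being read");
        }
        writer.write(record);
        writer.newLine();
    }

    /** Returns the next record, or null when the spool is exhausted. */
    public String read() throws IOException {
        if (reader == null) {
            // First read: flush and close the writer, then switch to reading.
            writer.close();
            writer = null;
            reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
        }
        return reader.readLine();
    }

    @Override
    public void close() throws IOException {
        if (writer != null) {
            writer.close();
        }
        if (reader != null) {
            reader.close();
        }
        Files.deleteIfExists(file);
    }

    /** Writes all records, then reads them back one by one (only one is held at a time while reading). */
    static List<String> roundTrip(List<String> records) {
        List<String> out = new ArrayList<>();
        try (DiskSpool spool = new DiskSpool()) {
            for (String r : records) {
                spool.write(r);
            }
            for (String r = spool.read(); r != null; r = spool.read()) {
                out.add(r);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList("record-0", "record-1", "record-2")));
    }
}
```

    In a component you would call read() once per onProcess pass and build a single node from the returned record, rather than collecting the results in a list as roundTrip does for demonstration.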

     

    Hope that helps.

     

    Phil.

      • jbenedict Accelrys 23 posts since
        Aug 2, 2007
        Sep 22, 2010 7:03 PM (in response to Pierre Ducrot)
        Re: Generate New Data Record (Java)

        Hi Pierre,

         

        Phil's suggestion is a good one. The nodeCache uses the same caching mechanism behind the Cache Writer, Merge, Group Data or any other aggregator components. In this case, it should cache nodes in memory until the memory limit is reached. In PP 7.5 and earlier, this limit was 50 MB. We increased that value to 100 MB for PP 8.0 on 64 bit platforms. Once the node cache reaches the memory limits, it should start writing records to disk. This should prevent the memory accumulation that you see since the default maximum JVM heap size is 256 MB per scisvr process.

         

        Give that a try and let us know if you run into any problems. Don't forget to "relinquish" any new nodes that you create AND the nodeCache when you finish with it. The nodes and the nodeCache are backed by native objects on the C++ side. They are reference counted, but you still need to call relinquish to release the memory allocated to them.
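
        One way to make sure nothing is left unreleased, even when an exception is thrown, is to pair every creation with a relinquish in a finally block. The sketch below shows the pattern only: NativeHandle is a hypothetical stand-in for any Java object backed by a native resource, not an SDK class, and its live counter exists purely so the example can check that everything was released.

```java
/** Sketch only: try/finally release pattern with a stand-in handle type. */
public class RelinquishSketch {

    /** Stand-in for a Java wrapper around a native object; not an SDK type. */
    static final class NativeHandle {
        static int live = 0;              // number of handles not yet released
        private boolean released = false;

        NativeHandle() {
            live++;
        }

        void relinquish() {
            if (!released) {              // releasing twice is harmless here
                released = true;
                live--;
            }
        }
    }

    /** Creates one "cache" and count "nodes", releasing each as soon as it is done with. */
    static int processAll(int count) {
        NativeHandle cache = new NativeHandle();
        try {
            for (int i = 0; i < count; i++) {
                NativeHandle node = new NativeHandle();
                try {
                    // Work with the node here (populate it, route it, ...).
                } finally {
                    node.relinquish();    // runs even if the work above throws
                }
            }
        } finally {
            cache.relinquish();           // the cache itself must be released too
        }
        return NativeHandle.live;
    }

    public static void main(String[] args) {
        System.out.println(processAll(1000)); // prints 0: nothing left unreleased
    }
}
```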

         

        Cheers,

        Jason

            • jbenedict Accelrys 23 posts since
              Aug 2, 2007
              Oct 5, 2010 7:12 PM (in response to Pierre Ducrot)
              Re: Generate New Data Record (Java)

              Hi Pierre,

              Nice graphic! It's hard to see what's going on without looking at all the code, but I have a theory. I think we can make a few improvements to your code as follows:
              1. Release the previous node prior to generating the next data record:
              if (index > 0)
              {
                   // index already points at the next node to serve, so the node served last time is at index - 1.
                   Node previousNode = stack.read(index - 1);
                   if (previousNode != null)
                        previousNode.relinquish();
                   // Make sure you release any Java-side memory by removing the node reference from your stack.
                   stack.set(index - 1, null);
              }
              // You may have to re-juggle the bookkeeping of the stack index at this point to avoid stale pointers
              // in your stack. As long as you don't back-track you might be okay.
              Node node = stack.read(index++);
              data.setRoot(node);
              if (index < stack.getSize())
              {
                   return Component.State.ReadyForNewData;
              }

              2. Release memory in onFinalize:

              // Iterate over all the nodes in onFinalize and relinquish them there. This should catch the last one on the stack, at least.
              public void onFinalize()
              {
                   for (int i = 0; i < stack.size(); i++)
                   {
                        // Use the loop variable here, not the serving index.
                        Node node = stack.read(i);
                        if (node != null)
                        {
                             node.relinquish();
                             stack.set(i, null);
                        }
                   }
              }

              I have to warn you that I haven't tested this theory directly and I coded it directly into the text box. So, please let us know if it doesn't work.
              Cheers,
              Jason
