<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: NioSax &#8211; Sax style xml parser for Java NIO</title>
	<atom:link href="http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/</link>
	<description>Java, XMPP, Space and pretty much everything else</description>
	<lastBuildDate>Thu, 09 Feb 2012 03:45:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Mina R Waheeb</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-191</link>
		<dc:creator><![CDATA[Mina R Waheeb]]></dc:creator>
		<pubDate>Fri, 25 Mar 2011 20:02:57 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-191</guid>
		<description><![CDATA[According to the javadoc of the method decode(ByteBuffer, CharBuffer, boolean):

&lt;blockquote&gt;The buffers' positions will be advanced to reflect the bytes read and the characters written, but their marks and limits will not be modified.&lt;/blockquote&gt;

So if we have [A0][A1][B0], invoking the decode method will decode the first char successfully, but it will fail on the second char with a CR_MALFORMED result. My point is that the buffer will still have 1 byte remaining, so we can resume decoding when more bytes are available. I have set up a unit test to simulate the network packets.

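&lt;code&gt;
// Snippet of NioSaxSource
private CharsetDecoder decoder;
private ByteBuffer buf;
// Single-char output buffer, reused for every call
private final CharBuffer tmp = CharBuffer.allocate(1);
// Bytes left in buf when the last decode failed; 0 if it succeeded
private int remaning;

public final boolean hasCharacter() {
        // Retry only if new bytes have arrived since the last failed attempt
        return buf != null &amp;&amp; remaning != buf.remaining() &amp;&amp; buf.hasRemaining();
}

public final char decode() {
        if (hasCharacter()) {
            tmp.rewind();
            CoderResult result = decoder.decode(buf, tmp, true);
            if (tmp.position() != 0) {
                // A char was written, even if trailing partial bytes were reported as malformed
                remaning = 0;
                return tmp.get(0);
            }
            if (result.isError()) {
                // Partial character: remember how many bytes are waiting for more data
                remaning = buf.remaining();
            }
        }
        return NOT_ENOUGH_DATA;
}
&lt;/code&gt;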

And the unit test:
&lt;code&gt;
@Test
public void testRandomSequence() throws Exception {
        byte[] b = createXMLUTF8ByteArray();
        Random rand = new Random();
        // Random, increasing split points that end at b.length
        List&lt;Integer&gt; parts = new ArrayList&lt;Integer&gt;();
        int min = 0, max = 0;
        while (max != b.length) {
            max = rand.nextInt(b.length - min) + min;
            if (!parts.contains(max)) {
                if (max == b.length - 1) {
                    max += 1;
                }
                parts.add(max);
            }
            min = max;
        }
        System.out.println(parts + &quot; &quot; + b.length);
        start();
        min = 0;
        ByteBuffer f = ByteBuffer.allocate(b.length);
        source.setByteBuffer(f);
        for (Integer i : parts) {
            // Remember where the parser stopped, append bytes [min, i), then go back
            f.mark();
            f.limit(i);
            f.position(min);
            f.put(b, min, i - min);
            f.reset();
            min = i;
            parser.parse(source);
        }
        stop();
}
&lt;/code&gt;]]></description>
		<content:encoded><![CDATA[<p>According to the javadoc of the method decode(ByteBuffer, CharBuffer, boolean):</p>
<blockquote><p>The buffers' positions will be advanced to reflect the bytes read and the characters written, but their marks and limits will not be modified.</p></blockquote>
<p>So if we have [A0][A1][B0], invoking the decode method will decode the first char successfully, but it will fail on the second char with a CR_MALFORMED result. My point is that the buffer will still have 1 byte remaining, so we can resume decoding when more bytes are available. I have set up a unit test to simulate the network packets.</p>
<p><code><br />
// Snippet of NioSaxSource<br />
private CharsetDecoder decoder;<br />
private ByteBuffer buf;<br />
// Single-char output buffer, reused for every call<br />
private final CharBuffer tmp = CharBuffer.allocate(1);<br />
// Bytes left in buf when the last decode failed; 0 if it succeeded<br />
private int remaning;</p>
<p>public final boolean hasCharacter() {<br />
        // Retry only if new bytes have arrived since the last failed attempt<br />
        return buf != null &amp;&amp; remaning != buf.remaining() &amp;&amp; buf.hasRemaining();<br />
}</p>
<p>public final char decode() {<br />
        if (hasCharacter()) {<br />
            tmp.rewind();<br />
            CoderResult result = decoder.decode(buf, tmp, true);<br />
            if (tmp.position() != 0) {<br />
                // A char was written, even if trailing partial bytes were reported as malformed<br />
                remaning = 0;<br />
                return tmp.get(0);<br />
            }<br />
            if (result.isError()) {<br />
                // Partial character: remember how many bytes are waiting for more data<br />
                remaning = buf.remaining();<br />
            }<br />
        }<br />
        return NOT_ENOUGH_DATA;<br />
}<br />
</code></p>
<p>And the unit test:<br />
<code><br />
@Test<br />
public void testRandomSequence() throws Exception {<br />
        byte[] b = createXMLUTF8ByteArray();<br />
        Random rand = new Random();<br />
        // Random, increasing split points that end at b.length<br />
        List&lt;Integer&gt; parts = new ArrayList&lt;Integer&gt;();<br />
        int min = 0, max = 0;<br />
        while (max != b.length) {<br />
            max = rand.nextInt(b.length - min) + min;<br />
            if (!parts.contains(max)) {<br />
                if (max == b.length - 1) {<br />
                    max += 1;<br />
                }<br />
                parts.add(max);<br />
            }<br />
            min = max;<br />
        }<br />
        System.out.println(parts + " " + b.length);<br />
        start();<br />
        min = 0;<br />
        ByteBuffer f = ByteBuffer.allocate(b.length);<br />
        source.setByteBuffer(f);<br />
        for (Integer i : parts) {<br />
            // Remember where the parser stopped, append bytes [min, i), then go back<br />
            f.mark();<br />
            f.limit(i);<br />
            f.position(min);<br />
            f.put(b, min, i - min);<br />
            f.reset();<br />
            min = i;<br />
            parser.parse(source);<br />
        }<br />
        stop();<br />
}<br />
</code></p>
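<p>The same idea can be sketched without the parser, assuming a plain CharsetDecoder stands in for NioSaxSource. The class SplitDecodeSketch and its method names are hypothetical and not part of NioSax. Instead of random split points, it tries every split position of a short UTF-8 string and checks that decoding the two chunks reproduces the text:</p>

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it stands in for the parser used in the test above.
final class SplitDecodeSketch {

    /** Decodes bytes that are delivered as two chunks split at the given offset. */
    static String decodeInTwoChunks(byte[] bytes, int split) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer in = ByteBuffer.allocate(bytes.length);
        CharBuffer out = CharBuffer.allocate(bytes.length);

        // First chunk: bytes from offset 0 up to (not including) split
        in.put(bytes, 0, split);
        in.flip();
        // endOfInput = false: an incomplete trailing sequence is left in the buffer
        decoder.decode(in, out, false);

        // Move the undecoded bytes to the front, then append the second chunk
        in.compact();
        in.put(bytes, split, bytes.length - split);
        in.flip();
        decoder.decode(in, out, true);
        decoder.flush(out);

        out.flip();
        return out.toString();
    }

    /** Returns true if every split position reproduces the original text. */
    static boolean allSplitsRoundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        for (int split = 0; split != bytes.length + 1; split++) {
            if (!decodeInTwoChunks(bytes, split).equals(text)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two- and three-byte UTF-8 characters exercise the partial-sequence path
        System.out.println(allSplitsRoundTrip("h\u00e9llo \u20ac"));
    }
}
```

<p>Checking every split position makes the run deterministic, whereas the random variant above covers longer documents with less code.</p>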
]]></content:encoded>
	</item>
	<item>
		<title>By: petermount1</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-190</link>
		<dc:creator><![CDATA[petermount1]]></dc:creator>
		<pubDate>Fri, 25 Mar 2011 14:28:48 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-190</guid>
		<description><![CDATA[The problem with the NIO Charset API is that it decodes an entire ByteBuffer in one go and assumes that everything is in that buffer. If the buffer is only partial (for example, due to a fragmented network packet), it will fail.

Here&#039;s the problem: say you receive from the network a couple of UTF-16 characters, A &amp; B. This would be 4 bytes in total:

[A0][A1][B0][B1]

Because the network fragments the packet, the second character is only partially received (its second byte is still in transit). In this case our ByteBuffer contains:

[A0][A1][B0]

The NIO API would then fail, as the second character is incomplete. What I do is decode up to the beginning of the partial character but leave it in the ByteBuffer, hence the NOT_ENOUGH_DATA state. When you return that buffer to NIO, it appends the next block from the network, which happens to contain [B1]; the char is then complete and can be decoded.]]></description>
		<content:encoded><![CDATA[<p>The problem with the NIO Charset API is that it decodes an entire ByteBuffer in one go and assumes that everything is in that buffer. If the buffer is only partial (for example, due to a fragmented network packet), it will fail.</p>
<p>Here&#8217;s the problem: say you receive from the network a couple of UTF-16 characters, A &amp; B. This would be 4 bytes in total:</p>
<p>[A0][A1][B0][B1]</p>
<p>Because the network fragments the packet, the second character is only partially received (its second byte is still in transit). In this case our ByteBuffer contains:</p>
<p>[A0][A1][B0]</p>
<p>The NIO API would then fail, as the second character is incomplete. What I do is decode up to the beginning of the partial character but leave it in the ByteBuffer, hence the NOT_ENOUGH_DATA state. When you return that buffer to NIO, it appends the next block from the network, which happens to contain [B1]; the char is then complete and can be decoded.</p>
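<p>A minimal sketch of the leave-it-in-the-buffer approach for the [A0][A1][B0] case, assuming the JDK decoder is called with endOfInput set to false. The class name PartialBufferSketch and the 16-element buffers are illustrative and not taken from NioSax:</p>

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of NioSax.
final class PartialBufferSketch {

    /** Decodes 'A' and 'B' (UTF-16BE) when the last byte of 'B' arrives late. */
    static String decodeFragmented() {
        CharsetDecoder decoder = StandardCharsets.UTF_16BE.newDecoder();
        ByteBuffer in = ByteBuffer.allocate(16);
        CharBuffer out = CharBuffer.allocate(16);

        // First network read: [A0][A1][B0]
        in.put(new byte[] {0x00, 0x41, 0x00});
        in.flip();
        // endOfInput = false tells the decoder that more bytes may follow
        CoderResult first = decoder.decode(in, out, false);
        if (first.isError() || in.remaining() != 1) {
            throw new IllegalStateException("expected one undecoded byte");
        }

        // Keep the undecoded [B0] at the front and append the next read: [B1]
        in.compact();
        in.put((byte) 0x42);
        in.flip();
        decoder.decode(in, out, false);

        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeFragmented()); // prints "AB"
    }
}
```

<p>The call to compact() is what keeps the partial character available for the next read; a real source would also need to tell a truncated character apart from genuinely malformed input.</p>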
]]></content:encoded>
	</item>
	<item>
		<title>By: Mina R Waheeb</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-189</link>
		<dc:creator><![CDATA[Mina R Waheeb]]></dc:creator>
		<pubDate>Fri, 25 Mar 2011 14:04:44 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-189</guid>
		<description><![CDATA[Thanks for sharing your code. It works like a charm :) Given my limited knowledge of the Charset internals, I would suggest using the JDK NIO Charset API:

&lt;code&gt;
// Snip of NioSaxSource
public final boolean isValid(final char c) {
      return c != NOT_ENOUGH_DATA &amp;&amp; c != INVALID_CHAR;
}

public final boolean hasCharacter() {
     return buffer != null &amp;&amp; buffer.hasRemaining();
}

public final char decode() {
        if (!hasCharacter()) {
            return NOT_ENOUGH_DATA;
        }
        b.rewind();
        //Where decoder = charset.newDecoder() and b = CharBuffer.allocate(1)
        CoderResult result = decoder.decode(buffer, b, true);
        if (result.isError()) {
            return INVALID_CHAR;
        }
        return b.get(0);
}
&lt;/code&gt;

Thanks again for sharing]]></description>
		<content:encoded><![CDATA[<p>Thanks for sharing your code. It works like a charm <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> Given my limited knowledge of the Charset internals, I would suggest using the JDK NIO Charset API:</p>
<p><code><br />
// Snip of NioSaxSource<br />
public final boolean isValid(final char c) {<br />
      return c != NOT_ENOUGH_DATA &amp;&amp; c != INVALID_CHAR;<br />
}</p>
<p>public final boolean hasCharacter() {<br />
     return buffer != null &amp;&amp; buffer.hasRemaining();<br />
}</p>
<p>public final char decode() {<br />
        if (!hasCharacter()) {<br />
            return NOT_ENOUGH_DATA;<br />
        }<br />
        b.rewind();<br />
        //Where decoder = charset.newDecoder() and b = CharBuffer.allocate(1)<br />
        CoderResult result = decoder.decode(buffer, b, true);<br />
        if (result.isError()) {<br />
            return INVALID_CHAR;<br />
        }<br />
        return b.get(0);<br />
}<br />
</code></p>
<p>Thanks again for sharing</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: petermount1</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-97</link>
		<dc:creator><![CDATA[petermount1]]></dc:creator>
		<pubDate>Tue, 24 Aug 2010 16:09:27 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-97</guid>
		<description><![CDATA[Didn&#039;t know about that one :-) Is there a direct link to it?]]></description>
		<content:encoded><![CDATA[<p>Didn&#8217;t know about that one <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  Is there a direct link to it?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rémi Forax</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-96</link>
		<dc:creator><![CDATA[Rémi Forax]]></dc:creator>
		<pubDate>Tue, 24 Aug 2010 13:26:26 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-96</guid>
		<description><![CDATA[&gt; No existing API.

Not right :)
Tatoo is a general parser generator that can produce NIO-based push parsers.
See http://portal.acm.org/citation.cfm?id=1529707

Rémi]]></description>
		<content:encoded><![CDATA[<p>&gt; No existing API.</p>
<p>Not right <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
Tatoo is a general parser generator that can produce NIO-based push parsers.<br />
See <a href="http://portal.acm.org/citation.cfm?id=1529707" rel="nofollow">http://portal.acm.org/citation.cfm?id=1529707</a></p>
<p>Rémi</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: petermount1</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-94</link>
		<dc:creator><![CDATA[petermount1]]></dc:creator>
		<pubDate>Wed, 18 Aug 2010 09:10:25 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-94</guid>
		<description><![CDATA[It doesn&#039;t have any validation in there; the core just parses what it receives into a DOM tree, which can then be consumed either in its entirety or in fragments (it was originally written to support XML streams, specifically XMPP/Jabber).

As for licence, it&#039;s BSD.]]></description>
		<content:encoded><![CDATA[<p>It doesn&#8217;t have any validation in there; the core just parses what it receives into a DOM tree, which can then be consumed either in its entirety or in fragments (it was originally written to support XML streams, specifically XMPP/Jabber).</p>
<p>As for licence, it&#8217;s BSD.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Santhosh Kumar T</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-93</link>
		<dc:creator><![CDATA[Santhosh Kumar T]]></dc:creator>
		<pubDate>Fri, 13 Aug 2010 19:41:20 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-93</guid>
		<description><![CDATA[I have looked for a SAX parser like this quite a few times. It looks impressive.
Is nioxml completely XML spec compliant? Are there any limitations?
And what is the license?

I will look into your project in more detail.
BTW, I also have a project which contains some core libraries. If you are interested, you can have a look at
http://code.google.com/p/jlibs/]]></description>
		<content:encoded><![CDATA[<p>I have looked for a SAX parser like this quite a few times. It looks impressive.<br />
Is nioxml completely XML spec compliant? Are there any limitations?<br />
And what is the license?</p>
<p>I will look into your project in more detail.<br />
BTW, I also have a project which contains some core libraries. If you are interested, you can have a look at<br />
<a href="http://code.google.com/p/jlibs/" rel="nofollow">http://code.google.com/p/jlibs/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: petermount1</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-85</link>
		<dc:creator><![CDATA[petermount1]]></dc:creator>
		<pubDate>Mon, 28 Jun 2010 12:36:49 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-85</guid>
		<description><![CDATA[No, because I must support NIO-based XML streams, so I need something similar to StAX but in a push rather than a pull configuration. Also, because character sets other than ASCII must be supported, I had to handle the possibility of the stream stopping partway through a character when the data is split up as it&#039;s sent over the network. No existing API out there supports that (without parsing from the beginning again), so I ended up writing one from scratch.

Also, the output had to be DOM as that is then passed on to other frameworks (specifically JAXB in my case) so a non-standard framework would be out.]]></description>
		<content:encoded><![CDATA[<p>No, because I must support NIO-based XML streams, so I need something similar to StAX but in a push rather than a pull configuration. Also, because character sets other than ASCII must be supported, I had to handle the possibility of the stream stopping partway through a character when the data is split up as it&#8217;s sent over the network. No existing API out there supports that (without parsing from the beginning again), so I ended up writing one from scratch.</p>
<p>Also, the output had to be DOM as that is then passed on to other frameworks (specifically JAXB in my case) so a non-standard framework would be out.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: anon_anon</title>
		<link>http://blog.retep.org/2010/06/25/niosax-sax-style-xml-parser-for-java-nio/#comment-84</link>
		<dc:creator><![CDATA[anon_anon]]></dc:creator>
		<pubDate>Mon, 28 Jun 2010 08:30:48 +0000</pubDate>
		<guid isPermaLink="false">http://blog.retep.org/?p=171#comment-84</guid>
		<description><![CDATA[Instead of SAX, have you investigated vtd-xml?]]></description>
		<content:encoded><![CDATA[<p>Instead of SAX, have you investigated vtd-xml?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

