
Skip Adding Empty Tables to PDF When Parsing XHTML Using iTextSharp

iTextSharp throws an error when you attempt to create a PdfTable with 0 columns. I have a requirement to take XHTML that is generated using an XSLT transformation and generate a PD

Solution 1:

You should be able to write your own tag processor that accounts for that scenario by subclassing iTextSharp.tool.xml.html.AbstractTagProcessor. In fact, to make your life even easier, you can subclass the existing, more specific iTextSharp.tool.xml.html.table.Table and override its End method so that a table with no content produces no elements:

using System.Collections.Generic;
using iTextSharp.text;
using iTextSharp.tool.xml;

public class TableTagProcessor : iTextSharp.tool.xml.html.table.Table {

    public override IList<IElement> End(IWorkerContext ctx, Tag tag, IList<IElement> currentContent) {
        //See if we've got anything to work with
        if (currentContent.Count > 0) {
            //If so, let our parent class worry about it
            return base.End(ctx, tag, currentContent);
        }

        //Otherwise return an empty list which should make everyone happy
        return new List<IElement>();
    }
}

Unfortunately, if you want to use a custom tag processor, you can't use the shortcut XMLWorkerHelper class. Instead, you'll need to parse the HTML into elements and add them to your document yourself. To do that you'll need an implementation of iTextSharp.tool.xml.IElementHandler, which you can create like this:

public class SampleHandler : iTextSharp.tool.xml.IElementHandler {
    //Generic list of elements
    public List<IElement> elements = new List<IElement>();

    //Add the supplied item to the list
    public void Add(IWritable w) {
        if (w is WritableElement) {
            elements.AddRange(((WritableElement)w).Elements());
        }
    }
}

You can use both classes above with the following code, which includes some sample invalid HTML (an empty table).

//Hold everything in memory
using (var ms = new MemoryStream()) {

    //Create new PDF document
    using (var doc = new Document()) {
        using (var writer = PdfWriter.GetInstance(doc, ms)) {

            doc.Open();

            //Sample HTML
            string html = "<table><tr><td>Hello</td></tr></table><table></table>";

            //Create an instance of our element helper
            var XhtmlHelper = new SampleHandler();

            //Begin pipeline
            var htmlContext = new HtmlPipelineContext(null);

            //Get the default tag processor
            var tagFactory = iTextSharp.tool.xml.html.Tags.GetHtmlTagProcessorFactory();

            //Add an instance of our new processor
            tagFactory.AddProcessor(new TableTagProcessor(), new string[] { "table" });

            //Bind the above to the HTML context part of the pipeline
            htmlContext.SetTagFactory(tagFactory);

            //Get the default CSS handler and create some boilerplate pipeline stuff
            var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);

            //Here's where we add our IElementHandler
            var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new ElementHandlerPipeline(XhtmlHelper, null)));

            //The worker dispatches commands to the pipeline stuff above
            var worker = new XMLWorker(pipeline, true);

            //Create a parser with the worker listed as the dispatcher
            var parser = new XMLParser();
            parser.AddListener(worker);

            //Finally, parse our HTML directly.
            using (TextReader sr = new StringReader(html)) {
                parser.Parse(sr);
            }

            //The above did not touch our document. Instead, all "proper" elements are stored in our helper class XhtmlHelper
            foreach (var element in XhtmlHelper.elements) {
                //Add these to the main document
                doc.Add(element);
            }

            doc.Close();

        }
    }
}
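
The sample above builds the PDF entirely in memory but stops before doing anything with the result. Here is a minimal sketch of one way to persist it, assuming you want a file on disk. The PdfOutput class, its Save method and the output path are hypothetical names used for illustration and are not part of the original answer:

```csharp
using System.IO;

public static class PdfOutput {
    //Hypothetical helper: write the finished PDF bytes to the given path
    public static void Save(MemoryStream ms, string path) {
        //MemoryStream.ToArray() still works after the stream has been closed,
        //so this can safely run after doc.Close()
        File.WriteAllBytes(path, ms.ToArray());
    }
}
```

Call it after doc.Close(), for example PdfOutput.Save(ms, "output.pdf"). The document has to be closed first, because the PDF is not complete until then.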
