Download pages as PDF


Every now and then a customer requests that the pages on their website be downloadable as PDF, and although it feels a bit “web 1.0”, there are situations in which it makes sense. I guess there is still a majority of people who like to hold a product sheet in their hands instead of viewing it on a screen. So I created a simple handler that serves a page as PDF. To get the actual PDF rendering done I used ABCPDF, which I found ultra simple to implement.

The ABCPDF wrapper class

The first step is to create a wrapper class with a method that renders any URL as PDF and writes the result to a file. By serving files we will be able to add some caching to the solution and avoid the overhead of rendering a PDF on every request.

using System.IO;

public static class AbcPdf
{
    public static void CreatePdf(string url, string fileName)
    {
        //Create a document and set the options
        //The using block releases the document even if rendering fails
        using (var theDoc = new WebSupergoo.ABCpdf8.Doc())
        {
            theDoc.HtmlOptions.Media = WebSupergoo.ABCpdf8.MediaType.Print;
            theDoc.HtmlOptions.RequestMethod = WebSupergoo.ABCpdf8.UrlRequestMethodType.Get;
            theDoc.HtmlOptions.Paged = true;
            theDoc.HtmlOptions.Engine = WebSupergoo.ABCpdf8.EngineType.Gecko;
            theDoc.HtmlOptions.UseScript = true;
            theDoc.Rect.Inset(50, 50);

            //Insert the html of the url and chain it over as many pages as needed
            //(taken from the ABCPDF paging example)
            int theID = theDoc.AddImageUrl(url);
            while (true)
            {
                theDoc.FrameRect();
                if (!theDoc.Chainable(theID))
                    break;
                theDoc.Page = theDoc.AddPage();
                theID = theDoc.AddImageToChain(theID);
            }

            //Flatten each page
            for (int i = 1; i <= theDoc.PageCount; i++)
            {
                theDoc.PageNumber = i;
                theDoc.Flatten();
            }

            //Create a directory if needed and save the pdf
            Directory.CreateDirectory(Path.GetDirectoryName(fileName));
            theDoc.Save(fileName);
        }
    }
}

The link between the CMS and ABCPDF

Next we create a class that links our CMS to the ABCPDF wrapper. It gets the page URL for a CMS page id using the CMS API, in this case EPiServer. It then compares the time the page was last saved in the CMS with the last write time of the previously rendered PDF of that page. If the file does not exist or is older than the page, we render a new PDF. Either way the method returns the file location.

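using System;
using System.Configuration;
using System.IO;
using EPiServer;
using EPiServer.Core;

public static class Pdfs
{
    private static readonly string _pdfCachePath = ConfigurationManager.AppSettings["pdfcachePath"];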

    public static string GetPdf(int pageId, out string pageName)
    {

        //Get the date and time of the last cached pdf if available
        var fileName = Path.Combine(_pdfCachePath, pageId + ".pdf");
        var exists = File.Exists(fileName);
        DateTime? fileTime = null;
        if (exists)
        {
            var fi = new FileInfo(fileName);
            fileTime = fi.LastWriteTime;
        }

        //Get the pagedata and retrieve the name
        //THIS IS WHERE YOU WOULD IMPLEMENT YOUR OWN CMS LOGIC
        PageData page = DataFactory.Instance.GetPage(new PageReference(pageId));
        pageName = page.PageName;

        //If we have no filetime we always create the pdf.
        //If we have a filetime we only create a new pdf when the page
        //was saved after the file was written.
        var createPdf = fileTime == null || page.Saved > fileTime;

        //Create as requested using the url of the page;
        if (createPdf)
        {
            var url = page.GetExternalUrl();
            AbcPdf.CreatePdf(url, fileName);
        }

        //Return the filename
        return fileName;
    }
}
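
The freshness rule in GetPdf can also be pulled out into a small pure function, which makes it easy to check without a CMS or ABCPDF at hand. This is only a minimal sketch: `PdfCacheRules` and `NeedsRender` are names I made up for the illustration and are not part of the code above.

```csharp
using System;

// Hypothetical helper: isolates the cache rule used in Pdfs.GetPdf.
public static class PdfCacheRules
{
    // fileTime: last write time of the cached PDF, or null when no file exists.
    // pageTime: the time the CMS page was last saved.
    public static bool NeedsRender(DateTime? fileTime, DateTime pageTime)
    {
        // No cached file yet: always render.
        if (fileTime == null)
            return true;

        // Cached file exists: render only when the page is newer than the file.
        return pageTime > fileTime.Value;
    }
}
```

Inside GetPdf this would be called as `PdfCacheRules.NeedsRender(fileTime, page.Saved)`. Note that a page saved at exactly the same moment the file was written counts as up to date.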

And of course the handler

Finally we create the handler that uses the previous two classes to create the PDF file if needed and then serve it to the client. It receives the id of the CMS page in the pid request parameter and returns the PDF file with a Content-Disposition header, so the client can choose to open or save the PDF.

using System.Web;

public class ServePdf : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        //Get the page id which needs to be returned as pdf
        //A missing or non-numeric value leaves pid at 0
        int pid;
        int.TryParse(context.Request["pid"], out pid);

        //If correct: get the pdf file location and return the file-data in the response
        if (pid != 0)
        {
            string name;
            var file = Pdfs.GetPdf(pid, out name);

            //Turn the page name into a safe file name
            name = Utils.Strings.ToSlug(name) + ".pdf";

            //Set the headers first, then write the file
            context.Response.ContentType = "application/pdf";
            context.Response.AddHeader("Content-Disposition", "attachment; filename=" + name);
            context.Response.WriteFile(file);
        }
        else
        {
            context.Response.StatusCode = 404;
        }
        context.Response.End();
    }

    public bool IsReusable
    {
        get
        {
            return false;
        }
    }
}
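
For the handler to receive requests it has to be registered, and the `pdfcachePath` setting read by the Pdfs class has to exist. Below is a minimal web.config sketch for IIS 7 integrated mode. The cache folder, the `pdf.axd` path and the `MySite` namespace and assembly name are placeholders I made up for this example, so substitute your own.

```xml
<configuration>
  <appSettings>
    <!-- Folder where rendered PDFs are cached (placeholder path) -->
    <add key="pdfcachePath" value="C:\Temp\PdfCache" />
  </appSettings>
  <system.webServer>
    <handlers>
      <!-- Placeholder path and type name; point these at your own handler class -->
      <add name="ServePdf" verb="GET" path="pdf.axd" type="MySite.ServePdf, MySite" />
    </handlers>
  </system.webServer>
</configuration>
```

With this in place a link such as /pdf.axd?pid=123 (123 being a hypothetical page id) would trigger the download. In classic pipeline mode the registration goes under system.web/httpHandlers instead.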

Conclusion

Clearly no rocket science, but if you do run into this request it might help get you up to speed. Of course there are other solutions available besides ABCPDF, and this is in no way a promotional post for that product. However, having tried both open source and paid products, I think the price of ABCPDF is well worth paying to avoid the hassle I encountered using e.g. iTextSharp.
