The Bucket List - 2011 #3 - Develop a software tool
The PDF Maker, which converts Word documents to PDF documents. This post may be absolute crap for non-binaries (those who don't work with 0s and 1s) and may cause yawning for binaries who browse the internet purely for pleasure. For others, if it still causes yawning, you may skip it.
The journey ...
Note: The roles in the conversation below all represent me, unless specific names are mentioned.
The Project Leader: PDF documents are the most widely used printable documents on the web. But Adobe does not give away any tool to create them. They provide the reader for free, but not a converter or a generator (both of them are priced. Hmmm.. good business strategy!). Let us develop a utility to create PDF documents.
The Designer: What is your exact requirement?
The Project Leader (draws on board): User inputs data --> PDF generated by utility
The Developer: What technology can I use?
The Architect: iText! iText is a widely used library to generate PDF documents. You can even generate bar codes and sign PDF documents using iText. You can use iText for free. It is open source.
The Designer: Fine! Let's use iText. We shall get input from the user in a comma-separated file. Let our utility parse the data and generate the PDF.
(draws on board)
CSV --> utility (using iText) --> PDF
The Developer: But how will I know which data in the CSV is to be displayed at which location in the PDF? There should be some mapping file which provides this information.
The Designer (ponders for some time): Fine, we will have two XML files, one describing the format of the CSV and the other describing the format of the PDF (draws on board)
CSV --> XML file --> utility (using iText) --> XML file --> PDF
The Developer: Looks fine. I shall develop some logic to map the data of the CSV and the PDF using these XML files. But hmmm, what is the need for a CSV file here? Why can't the user provide the data directly in an XML file?
(Developer and Designer brainstorm for some time)
The Designer: Fine, we shall remove the CSV part. Let's have only an XML file as input. Let the user visualize a PDF document as being made up of many tables. (Draws on board) For example
Now he just needs to fill data in the columns. The user needs to prepare an XML file in this format (draws again)
He needs to place his data within column tags. Also, within the column tag we can provide attributes to choose font size, color, text alignment, text indentation etc.
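The drawings from the board did not survive, so here is a minimal sketch of how such an input could be read, not the actual format the tool used. Only the idea of column tags carrying formatting attributes comes from the conversation above; the class name `ColumnReader`, the `page`, `table` and `row` tags and the attribute names `fontSize` and `align` are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical reader for the "tables and columns" input described above.
class ColumnReader {

    /** One cell of the page: its text plus optional formatting attributes. */
    static final class Column {
        final String text;
        final String fontSize; // empty string when the attribute is absent
        final String align;    // empty string when the attribute is absent

        Column(String text, String fontSize, String align) {
            this.text = text;
            this.fontSize = fontSize;
            this.align = align;
        }
    }

    /** Collects every column element in document order. */
    static List<Column> readColumns(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("column");
            List<Column> columns = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                columns.add(new Column(
                        e.getTextContent().trim(),
                        e.getAttribute("fontSize"), // assumed attribute name
                        e.getAttribute("align")));  // assumed attribute name
            }
            return columns;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not readable XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<page><table><row>"
                + "<column fontSize=\"12\" align=\"center\">Invoice</column>"
                + "<column>Chennai</column>"
                + "</row></table></page>";
        for (Column c : readColumns(xml)) {
            System.out.println(c.text + " [size=" + c.fontSize + ", align=" + c.align + "]");
        }
    }
}
```

Keeping the formatting as attributes on the column tag means the user only ever edits one file, which is the point the Designer was making.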
The Developer (ponders and replies): OK, this is possible to implement. I shall try it. But iText is too complicated. I need to store all the data somewhere, probably in Java beans, and then arrange the data manually on the PDF, which is cumbersome. Can this be overcome?
The Architect: You can use Apache FOP in that case. Apache FOP (Formatting Objects Processor) is an open source tool to parse XSL-FO files and convert them to PDF / PostScript / .. files. Shouvik mentioned this to me.. That guy is really curious about technology.
The Designer: Oh! That's fine.. What is XSL-FO? We only have XML as input.
The Architect: XSL-FO is a language for formatting XML data for output to screen, paper or other media. You don't have to write the code to convert an XML file to PDF. Just generate an XSL-FO file from the XML. Give this XSL-FO file to Apache FOP. Apache FOP will generate the PDF for you!! So the flow will be like this (draws on board)
XML --> utility (to convert into XSL-FO) --> XSL-FO --> Apache FOP --> PDF
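To make the middle step of that flow concrete, here is a rough sketch of a utility that turns rows of text into an XSL-FO table. It is an illustration under assumptions, not the code of the actual tool: the class name `FoWriter`, the A4 page size and the margins are all made up for the example. It uses only the JDK, so it stops at the XSL-FO text and leaves the PDF step to FOP.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical "utility (to convert into XSL-FO)" step from the flow above.
class FoWriter {

    /** Escapes the characters that are not allowed in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds a one-table XSL-FO document; rows must contain at least one row. */
    static String toXslFo(List<List<String>> rows) {
        StringBuilder fo = new StringBuilder();
        fo.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
          .append("<fo:root xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">\n")
          .append(" <fo:layout-master-set>\n")
          // Page geometry is an assumption (A4 with 2cm margins).
          .append("  <fo:simple-page-master master-name=\"A4\"")
          .append(" page-height=\"29.7cm\" page-width=\"21cm\" margin=\"2cm\">\n")
          .append("   <fo:region-body/>\n")
          .append("  </fo:simple-page-master>\n")
          .append(" </fo:layout-master-set>\n")
          .append(" <fo:page-sequence master-reference=\"A4\">\n")
          .append("  <fo:flow flow-name=\"xsl-region-body\">\n")
          .append("   <fo:table table-layout=\"fixed\" width=\"100%\">\n")
          .append("    <fo:table-body>\n");
        for (List<String> row : rows) {
            fo.append("     <fo:table-row>\n");
            for (String cell : row) {
                fo.append("      <fo:table-cell><fo:block>")
                  .append(escape(cell))
                  .append("</fo:block></fo:table-cell>\n");
            }
            fo.append("     </fo:table-row>\n");
        }
        fo.append("    </fo:table-body>\n")
          .append("   </fo:table>\n")
          .append("  </fo:flow>\n")
          .append(" </fo:page-sequence>\n")
          .append("</fo:root>\n");
        return fo.toString();
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Name", "City"),
                Arrays.asList("Kannan", "Chennai"));
        System.out.println(toXslFo(rows));
    }
}
```

If FOP is installed, saving the printed output as table.fo and running `fop -fo table.fo -pdf table.pdf` from the command line should take care of the last arrow in the diagram.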
The Developer: Looks complicated but interesting. I need to surf the net. I shall let you know. Give me some time.
The Architect and The Designer: Yeah, go on!
The Designer and the Architect test it and find it to be good.
The Designer: Looks good! But I find it difficult to place data in the correct column. It needs some visualization and trial and error. But anyway, good work.
The Developer feels good and shows it to his teammate Shubankar.
Shubankar (after seeing it and getting a brief description of the technical flow): But how will a dumb user know about XML? He should have a simple interface.
The Developer gets stuck and decides to change the input again.
The Developer (to the Architect): I need to provide a simple interface. Maybe an editor where the user can provide his data.
The Architect: Abhishek told me about TinyMCE, an online editor used by Facebook, Microsoft and WordPress. You may check it out. But it can be used only for web applications, not for your standalone application. You may try with a Word document. The user can provide a Word document as input. You can convert the Word document to XSL-FO and your utility can then convert it to PDF using Apache FOP.
(draws on board)
document --> utility (to generate XSL-FO) --> XSL-FO --> Apache FOP --> PDF
The Developer: That is a Word to PDF converter! Now how will I convert Word to XSL-FO?
The Architect: Luckily Microsoft Word has a provision for that. Ask the user to save the Word document as an XML file. Now the flow remains the same as before
(draws on board)
XML --> utility (to convert into XSL-FO) --> XSL-FO --> Apache FOP --> PDF
The Developer: But how will I convert the XML to XSL-FO? I don't know the structure of this XML. In the previous case I decided the format of the XML, so I could convert it to XSL-FO in my code. How can I do it here?
The Architect: XSLT!! eXtensible Stylesheet Language Transformations. Microsoft provides the stylesheet for a Word document saved as XML. So just provide the XML and the XSLT to your Apache FOP. It will generate XSL-FO. Again pass this to Apache FOP to generate the PDF.
The flow will be:
document --> XML & XSLT --> Apache FOP --> XSL-FO --> Apache FOP (another method) --> PDF
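The XML plus XSLT step in that flow can be tried with nothing but the JDK. The sketch below is only an illustration: it uses the JDK's built-in `javax.xml.transform` classes rather than FOP, and the tiny stylesheet and the `note` / `para` input tags are made up for the example. It is not Microsoft's stylesheet for Word XML, which is far larger.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo of "XML & XSLT --> XSL-FO" using the JDK's XSLT engine.
class XmlToFo {

    // Toy stylesheet (an assumption): every <para> becomes one fo:block.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
        + "<xsl:template match=\"/note\">"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name=\"A4\""
        + " page-height=\"29.7cm\" page-width=\"21cm\" margin=\"2cm\">"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference=\"A4\">"
        + "<fo:flow flow-name=\"xsl-region-body\">"
        + "<xsl:for-each select=\"para\">"
        + "<fo:block><xsl:value-of select=\".\"/></fo:block>"
        + "</xsl:for-each>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML and returns the result as text. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<note><para>Hello</para><para>PDF</para></note>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```

The nice part of this design is the one the Architect points out: the Java code knows nothing about the structure of the input. Swap in a different stylesheet and the same `transform` call handles a different kind of XML.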
The Developer: Okie! I will do it.
Within a few hours the developer completes the coding and goes home. At home his roommate Kannan tests the code.
Kannan: Your tool failed my testing! It does not work with Microsoft Word 2007.
The Debugger (after a little analysis): Yeah! From the 2007 version, Microsoft Word itself supports saving a Word document as PDF. And the format of a 2007 document is different from that of 2003. Whhoosh! All a waste of time??
The Blogger: No dude! Your objective was not to develop a tool which no one had developed before. It was only to develop and enjoy the process and, of course, check off one more item in your bucket list....
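For the curious binaries: the failure Kannan hit could at least be reported cleanly by checking the input before transforming it. The sketch below is an assumption about how such a check might look, not something the tool does. It relies on the fact that a document saved as XML from Word 2003 has a `wordDocument` root element in the namespace `http://schemas.microsoft.com/office/word/2003/wordml`; anything else is treated as unsupported. The class name `WordXmlCheck` is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical pre-check: is this the Word 2003 XML the stylesheet expects?
class WordXmlCheck {

    static final String WORDML_2003 =
            "http://schemas.microsoft.com/office/word/2003/wordml";

    /** Returns true only for well-formed XML whose root is a 2003 wordDocument. */
    static boolean isWord2003Xml(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to read the root namespace
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return WORDML_2003.equals(root.getNamespaceURI())
                    && "wordDocument".equals(root.getLocalName());
        } catch (Exception e) {
            // Not XML at all, for example a renamed .doc file.
            return false;
        }
    }

    public static void main(String[] args) {
        String word2003 = "<w:wordDocument xmlns:w=\"" + WORDML_2003 + "\"/>";
        System.out.println(isWord2003Xml(word2003));
        System.out.println(isWord2003Xml("<somethingElse/>"));
    }
}
```

With a check like this the tool could say "please save as Word 2003 XML" instead of failing somewhere inside the stylesheet.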
Appendix 1:
Revised bucket list:
See the Taj Mahal
Write at least one chapter of my novel
Develop a software tool
Visit the United States of America
Develop six packs
Start rebuilding our family house
Have a conversation in Hindi
Put on 5 kilos
Make a short video
Play a T20 or a T10 game
Cook a meal
See my close friend Richard achieve his dream
Create a painting
Learn guitar
Learn swimming
Watch a world cup match
Appendix 2:
Download the utility from this url:
http://sourceforge.net/projects/pdmak/
It contains a zip file. Unzip the file in any directory and click the jar file. Provide the path of the XML file (a Word document SAVED as XML, NOT RENAMED, with no option selected while saving; Word asks for the options 'Apply Transform' and 'Save data' while saving, don't select either of them). Click on Generate. Voila! The PDF is generated in the same path under the same name.
Note: You need the latest version of Java installed on your machine.
Still having problems?? Raise your bugs on blogspot!!
Thursday, February 03, 2011 | Labels: Fun | 6 Comments


