CSV to PDF – Finxter

0
144
CSV to PDF – Finxter


Story: This series of articles assume you are an employee of the City of Sacramento’s IT Department.

At the end of each month, a CSV file is sent to the Chief of Police outlining the crime statistics for the current month.

However, the Chief prefers to view the output in a styled PDF format based on a District/Beat of his choosing.

Part 2 of this series is a continuation of Part 1 and focuses on:

  • reading in the CSV created in Part 1
  • modifying the column headings for clarity
  • creating and formatting a PDF from the data
  • saving the PDF

💡 Note: Before continuing you must have the appropriate CSV file created in Part 1.


Getting Started

Remember to add the Required Starter Code to the top of each code snippet. This snippet will allow the code in this article to run error-free.

Required Starter Code

import pandas as pd
from fpdf import FPDF
import csv
import datetime
import yagmail

Before any data manipulation can occur, three (3) new libraries will require installation.

  • The pandas library enables access to/from a DataFrame.
  • The fpdf library enables the conversion of a CSV file to a professionally formatted PDF file.
  • The yagmail library is a GMAIL/SMTP client that makes it easy to send emails.

To install these libraries, navigate to an IDE terminal. At the command prompt ($), execute the code below. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

$ pip install fpdf

Hit the <Enter> key on the keyboard to start the installation process.

$ pip install yagmail

Hit the <Enter> key on the keyboard to start the installation process.


Read in CSV to a List

In preparation for converting the CSV file to a PDF, this code snippet reads in the CSV file (3C.csv) created in Part 1 and located in the current working directory.

with open(f'{rpt_num}.csv', 'r') as csvfile:
    data_list= list(csv.reader(csvfile))[1:]
print(data_list[:3]
  • Line [1] opens the CSV in read (r) mode created in Part 1 (3C.csv) located in the current working directory.
  • Line [2] reads in all rows (excluding header row) and converts the rows to a list. This output saves to data_list.
  • Line [3] outputs the top three (3) rows from data_list to the terminal.

Output:

[['0', '1/1/06 0:00', '3108 Occidental Dr', '1115', '10851(A)Vc Take Veh W/O Owner'],
['13', '1/1/06 0:00', '7721 College Town Dr', '888', '530.5 Pc Use Personal Id Info'],
['7', '1/1/06 0:00', '3547 P St', '853', '484 Pc Petty Theft/Inside']]

💡 Note: Before continuing you may want to remove Line [3].


All documents from the City of Sacramento require the addition of their logo. To save the logo to the current working directory, perform the following steps:

  • With your mouse, hover over the logo shown below.
  • View the pop-up menu by performing a right-mouse click.
  • Click to select Save image as from the pop-up menu.
  • From the Save As dialog box, perform the following steps:
    • Browse to select the current working directory.
    • Enter the filename sacramento_logo.
    • Click the Save button.

Set PDF Variables

This code snippet declares several variables that the code will use in the creation of the PDF.

pdf_name = f'{rpt_num}.pdf'
rpt_hdgs = ['Row #', 'Date/Time', 'Address', 'Grid', 'Description'] 
cwidths  = [20, 40, 50, 30, 55]
line_height = 7
  • Line [1] creates a filename based on the variable rpt_num from Part 1. The extension ‘.pdf‘ is appended. The output saves to pdf_name (example: 3C.pdf).
  • Line [2] creates a list of new Column Headings. This output saves to rpt_hdgs.
  • Line [3] creates a list of column widths to use in the PDF.
  • Line [4] assigns a line height for row spacing in the PDF.

Create PDF Class

In this section, a new class is declared. This class contains header and footer functions to style the PDF.

class PDF(FPDF):

Inside the PDF Class, indent and add in the following code. This code styles the header and displays it on each PDF page.

    def header(self):
        today         = datetime.date.today()
        date_fmt      = today.strftime("%B" " " "%d" ", " "%Y")
        self.l_margin = 6
        self.r_margin = 6
        
        self.set_font("Arial", "", 7)
        self.image("sacramento_logo.png", 10, 8, 36)
        
        self.cell(80)
        self.set_font("Arial", "", 11)
        self.set_text_color(43,60,102)
        self.cell(30, 3, f"District/Beat: {rpt_num}", 0, 0, "C")

        self.set_font("Arial", "", 7)
        self.cell(-30, 11, f"{date_fmt}", 0, 0, "C")
        self.ln(12)

        self.set_fill_color(240,248,255)

        col = 0
        while col < len(rpt_hdgs):
            col_width = cwidths[col]
            self.cell(col_width, line_height, rpt_hdgs[col], 0, 0, fill=True)    
            col += 1   
        self.ln(12)    
  • Line [1] declares the header function.
  • Line [2-5] sets the current date and margins for the PDF.
  • Line [6] sets the font name and size.
  • Line [7] sets the image file to display in the top left corner of the PDF.
  • Line [8] sets the position of the PDF heading from the left margin.
  • Line [9-11] sets the font name, size, and offset of the heading from the top margin.
  • Line [12-13] sets the font name and size, and offset of the current date from the top margin.
  • Line [14] sets the offset from the current date to the next row (table header row).
  • Line [15] sets the background color for the header row.
  • Line [16] sets a counter variable col for the while loop.
  • Line [17-20] creates a while loop that displays the header row columns using the variable rpt_hdgs.
  • Line [21] sets the offset from the heading row to the next row.

The above header code styles the PDF header as shown below.

Inside the PDF Class, indent and add in the following code. This code styles the footer and displays it on each PDF page.

    def footer(self):
        self.set_font("Arial", "I", 7)
        self.set_fill_color(240,248,255)
        self.cell(0, line_height, "Report Page " + str(self.page_no()) + "/{nb}", 0, 0, "C", fill=True)
  • Line [1] declares the footer function.
  • Line [2] sets the font, style and size for the footer.
  • Line [3] sets the background color for the footer row.
  • Line [4] sets the placement and footer text for each page.

Output:

The above footer code styles the PDF footer as shown below.


Put Data in PDF

Now that the header and footer code has been created, we need code to display the data from the data_lst created in Part 1.

def convert_to_pdf(data_list):
    pdf = PDF()
    pdf.alias_nb_pages()
    pdf.add_page()
    pdf.set_font("Arial", "", 7)

    row_count = 0
    while row_count < len(data_list):
        col = 0
        for c in cwidths:
            pdf.cell(c, 0, data_list[row_count][col], align="L", border=0)
            col += 1
        pdf.ln(4)
        row_count += 1
    pdf.output(pdf_name, "F")
convert_to_pdf(data_list)
  • Line [1] declares a function with one (1) parameter, data_lst.
  • Line [2] declares an instance of the class PDF.
  • Line [3-5] sets up the PDF pages and font details for the data.
  • Line [6-14] instantiates a while loop to output the data to the PDF based on rows/page.
  • Line [15] sets the output to a PDF.
  • Line [16] calls the convert_to_pdf function with the appropriate parameter to create the PDF.

Output:

Depending on the report selected, the output will vary.


Create a PDF Complete Code

Below is the complete code from Part 1 and Part 2.

import pandas as pd  
from fpdf import FPDF
import csv
import datetime

cols = ['cdatetime', 'address', 'district', 'beat', 'grid', 'crimedescr']
df   = pd.read_csv('crimes.csv', usecols=cols)
df.sort_values('cdatetime', inplace=True, ascending=True)

df['beat'] = df['beat'].str.rstrip()
df = df.apply(lambda x: x.astype(str).str.title())

lst="123456ABCQ"
rpt_num = None

while True:
    rpt_num = input('Select a District/Beat (1A-6C or Q to quit): ').upper()
    if rpt_num == 'Q':
        exit()
    elif rpt_num[0] not in lst[0:6] or rpt_num[1] not in lst[6:9]:
        print('You entered an invalid selection!')
    else:
        break

print(f'Report {rpt_num} generating!')

the_filter = (df.query(f"beat == '{rpt_num}'"))
tot_recs  = the_filter.count()

filt_cols=['cdatetime','address','grid','crimedescr']
the_filter.to_csv(f'{rpt_num}.csv', columns=filt_cols)

print(f'Report {rpt_num}.csv is in the current working directory!')

with open(f'{rpt_num}.csv', 'r') as csvfile:
    data_list = list(csv.reader(csvfile))[1:]

pdf_name = f'{rpt_num}.pdf'
rpt_hdgs = ['Row #', 'Date/Time', 'Address', 'Grid', 'Description'] 
cwidths  = [20, 40, 50, 30, 55]
line_height = 7

class PDF(FPDF):
    def header(self):
        today         = datetime.date.today()
        date_fmt      = today.strftime("%B" " " "%d" ", " "%Y")
        self.l_margin = 6
        self.r_margin = 6
        
        self.set_font("Arial", "", 7)
        self.image("sacramento_logo.png", 10, 8, 36)
        
        self.cell(80)
        self.set_font("Arial", "", 11)
        self.set_text_color(43,60,102)
        self.cell(30, 3, f"District/Beat: {rpt_num}", 0, 0, "C")

        self.set_font("Arial", "", 7)
        self.cell(-30, 11, f"{date_fmt}", 0, 0, "C")
        self.ln(12)

        self.set_fill_color(240,248,255)

        col = 0
        while col < len(rpt_hdgs):
            col_width = cwidths[col]
            self.cell(col_width, line_height, rpt_hdgs[col], 0, 0, fill=True)    
            col += 1   
        self.ln(12)    

    def footer(self):
        # self.set_y(-15)
        self.set_font("Arial", "I", 7)
        self.set_fill_color(240,248,255)
        self.cell(0, line_height, "Report Page " + str(self.page_no()) + "/{nb}", 0, 0, "C", fill=True)

def convert_to_pdf(data_list):
    pdf = PDF()
    pdf.alias_nb_pages()
    pdf.add_page()
    pdf.set_font("Arial", "", 7)

    row_count = 0
    while row_count < len(data_list):
        col = 0
        for c in cwidths:
            pdf.cell(c, 0, data_list[row_count][col], align="L", border=0)
            col += 1
        pdf.ln(4)
        row_count += 1
    pdf.output(pdf_name, "F")
convert_to_pdf(data_list)

Summary

In this article, you learned how to:

  • Convert a CSV to a List.
  • Save an Image.
  • Create a Class.
  • Create functions within a Class.
  • Create and save a PDF file.

What’s Next

In Part 3 of this series, you will learn how to attach the PDF file to a Gmail account and send the email to a recipient.



Source link

Leave a reply

Please enter your comment!
Please enter your name here