Convert Amazon transaction data for import into accounting system The 2019 Stack Overflow Developer Survey Results Are InConvert total minutes into “Hours:minute:seconds” for marathon dataOptimizing for data import in Neo4j using py2neoPython CSV command line tool to import CSV into MySQLImport Excel, unpivot data, output tab-delimited for SQL ImportProcessing eBay category list for database importPython command-line program to convert genomic data fileConvert data into dictionarySearching for data from file1 in file2System that manages employee data for managersConvert *.VCF contact file for cellphone import
Keeping a retro style to sci-fi spaceships?
Why doesn't shell automatically fix "useless use of cat"?
APIPA and LAN Broadcast Domain
Are there any other methods to apply to solving simultaneous equations?
Are turbopumps lubricated?
Is Cinnamon a desktop environment or a window manager? (Or both?)
How to obtain a position of last non-zero element
How to type a long/em dash `—`
Can there be female White Walkers?
Likelihood that a superbug or lethal virus could come from a landfill
What do hard-Brexiteers want with respect to the Irish border?
Why not take a picture of a closer black hole?
Is an up-to-date browser secure on an out-of-date OS?
Correct punctuation for showing a character's confusion
Ubuntu Server install with full GUI
writing variables above the numbers in tikz picture
Is bread bad for ducks?
Why don't hard Brexiteers insist on a hard border to prevent illegal immigration after Brexit?
Inverse Relationship Between Precision and Recall
I am an eight letter word. What am I?
How to charge AirPods to keep battery healthy?
Why doesn't UInt have a toDouble()?
Did Scotland spend $250,000 for the slogan "Welcome to Scotland"?
How do I free up internal storage if I don't have any apps downloaded?
Convert Amazon transaction data for import into accounting system
The 2019 Stack Overflow Developer Survey Results Are InConvert total minutes into “Hours:minute:seconds” for marathon dataOptimizing for data import in Neo4j using py2neoPython CSV command line tool to import CSV into MySQLImport Excel, unpivot data, output tab-delimited for SQL ImportProcessing eBay category list for database importPython command-line program to convert genomic data fileConvert data into dictionarySearching for data from file1 in file2System that manages employee data for managersConvert *.VCF contact file for cellphone import
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;
$begingroup$
At the end of each billing cycle, Amazon generates a raw transaction file for my store's orders that cycle. I am converting that raw transaction file into a .csv file to be imported into my accounting software. My converter program takes two input files.
The first input, the Amazon data file, contains hundreds of lines, and looks like this:
settlement-id settlement-start-date settlement-end-date deposit-date total-amount currency transaction-type order-id merchant-order-id adjustment-id shipment-id marketplace-name amount-type amount-description amount fulfillment-id posted-date posted-date-time order-item-code merchant-order-item-id merchant-adjustment-item-id sku quantity-purchased promotion-id
11774871501 2019-04-01 13:09:26 UTC 2019-04-07 14:54:57 UTC 2019-04-09 14:54:57 UTC 11591.38 USD
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemPrice Principal 29.95 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemPrice Shipping 4.42 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees FBAPerUnitFulfillmentFee -3.19 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees Commission -4.49 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees ShippingChargeback -4.42 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemPrice Principal 173.08 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemFees FBAPerUnitFulfillmentFee -9.02 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemFees Commission -25.96 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
…
The second input is a CSV file that stores the necessary accounting codes, which contains a few dozen lines, and looks like this:
combined-type,AccountCode,Description
Non-AmazonOrderItemFeesFBAPerUnitFulfillmentFee,501,Non-Amazon - Order - ItemFees - FBAPerUnitFulfillmentFee
Amazon.comOrderItemPricePrincipal,400,Amazon.com - Order - ItemPrice - Principal
Amazon.comOrderItemPriceShipping,403,Amazon.com - Order - ItemPrice - Shipping
Amazon.comOrderItemPriceShippingTax,202,Amazon.com - Order - ItemPrice - ShippingTax
…
I used Python 3 and Pandas. Can you help me make this code bulletproof? My concern is the code does not catch errors and/or does not execute my intent.
Landmark 1:
Read in the settlement file (raw transaction file).
Read in the account codes file to match transactions to accounting codes.
Landmark 2:
Clean up the settlement file by collecting the summary data and converting the four columns into a combined column for comparison and matching to the account codes file.
Landmark 3:
Find the matching account code for each transaction and sum the total. So now we should have a summary amount for each account code.
Landmark 4:
Clean up the new combined dataframe into the proper template format for our accounting software. Export new file as .csv for uploading into accounting software.
#!/usr/bin/env python3.6
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
#Read in Amazon settlement - can be picked up by Python script later
az_data = pd.read_csv('/Users/XXX/Desktop/az_data.txt', sep='t', header=0, parse_dates=['settlement-start-date', 'settlement-end-date'])
df = pd.DataFrame(az_data)
#Read in Account codes - this can be SQL storage later
acct_codes = pd.read_csv('/Users/XXX/Desktop/acct_codes.csv', sep=',', header=0)
df_accts = pd.DataFrame(acct_codes)
#Take summary data from first row of Amazon settlement and use as check-data
settlement_id = df.iloc[0,0]
settlement_start_date = df.iloc[0,1]
settlement_end_date = df.iloc[0,2]
deposit_date = df.iloc[0,3]
invoice_total = df.iloc[0,4]
#Drop summary row as it is no longer needed and doesn't match
df.drop(df.index[0], inplace=True)
#Replace blank values in 'marketplace-name' column with 'alt-transaction' so groupby doesn't skip those values. And replace all other blank values with 'NA' value
fillvalues = 'marketplace-name': 'alt-transaction'
df.fillna(value=fillvalues, inplace=True)
df.fillna('NA', inplace=True)
#Create combined column to use as a key
df['combined-type'] = df['marketplace-name'] + df['transaction-type'] + df['amount-type']+ df['amount-description']
#Groupby combined column and take sum of categories
df_mod = df.groupby(['combined-type'])[['amount']].sum()
#Merge dataframes to get account codes and descriptions
df_results = df_mod.merge(df_accts, on='combined-type', how='left')
#Drop row from un-used account
df_results = df_results[df_results['combined-type'] != 'Non-AmazonOrderItemPricePrincipal']
#Rename columns to match Xero template
df_results.rename(columns='amount':'UnitAmount', inplace=True)
#Drop the now un-needed combined-type column
df_results.drop(columns=['combined-type'], inplace=True)
#Add invoice template columns with data
df_results['ContactName'] = 'Amazon.com'
df_results['InvoiceNumber'] = 'INV_' + str(settlement_id)
df_results['Reference'] = 'AZ_Xero_Py_' + str(settlement_id)
df_results['InvoiceDate'] = settlement_start_date
df_results['DueDate'] = settlement_end_date
df_results['Quantity'] = 1
df_results['Currency'] = 'USD'
df_results['TaxType'] = 'Tax on Sales'
df_results['TaxAmount'] = 0
df_results['TrackingName1'] = 'Channel'
df_results['TrackingOption1'] = 'Amazon'
#Re-order columns to match template
all_column_list = ['ContactName','EmailAddress','POAddressLine1','POAddressLine2','POAddressLine3','POAddressLine4','POCity','PORegion','POPostalCode','POCountry','InvoiceNumber','Reference','InvoiceDate','DueDate','InventoryItemCode','Description','Quantity','UnitAmount','Discount','AccountCode','TaxType','TrackingName1','TrackingOption1','TrackingName2','TrackingOption2','Currency','BrandingTheme']
df_final = df_results.reindex(columns=all_column_list)
print (df_final.dtypes)
#Re-format the datetimes to be dates
df_final['InvoiceDate'] = df_final['InvoiceDate'].dt.date
df_final['DueDate'] = df_final['DueDate'].dt.date
#Export final df to csv
df_final.to_csv('/Users/XXX/Desktop/invoice' + str(settlement_id) + '.csv', index=False)
python python-3.x csv pandas e-commerce
New contributor
$endgroup$
add a comment |
$begingroup$
At the end of each billing cycle, Amazon generates a raw transaction file for my store's orders that cycle. I am converting that raw transaction file into a .csv file to be imported into my accounting software. My converter program takes two input files.
The first input, the Amazon data file, contains hundreds of lines, and looks like this:
settlement-id settlement-start-date settlement-end-date deposit-date total-amount currency transaction-type order-id merchant-order-id adjustment-id shipment-id marketplace-name amount-type amount-description amount fulfillment-id posted-date posted-date-time order-item-code merchant-order-item-id merchant-adjustment-item-id sku quantity-purchased promotion-id
11774871501 2019-04-01 13:09:26 UTC 2019-04-07 14:54:57 UTC 2019-04-09 14:54:57 UTC 11591.38 USD
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemPrice Principal 29.95 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemPrice Shipping 4.42 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees FBAPerUnitFulfillmentFee -3.19 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees Commission -4.49 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees ShippingChargeback -4.42 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemPrice Principal 173.08 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemFees FBAPerUnitFulfillmentFee -9.02 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemFees Commission -25.96 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
…
The second input is a CSV file that stores the necessary accounting codes, which contains a few dozen lines, and looks like this:
combined-type,AccountCode,Description
Non-AmazonOrderItemFeesFBAPerUnitFulfillmentFee,501,Non-Amazon - Order - ItemFees - FBAPerUnitFulfillmentFee
Amazon.comOrderItemPricePrincipal,400,Amazon.com - Order - ItemPrice - Principal
Amazon.comOrderItemPriceShipping,403,Amazon.com - Order - ItemPrice - Shipping
Amazon.comOrderItemPriceShippingTax,202,Amazon.com - Order - ItemPrice - ShippingTax
…
I used Python 3 and Pandas. Can you help me make this code bulletproof? My concern is the code does not catch errors and/or does not execute my intent.
Landmark 1:
Read in the settlement file (raw transaction file).
Read in the account codes file to match transactions to accounting codes.
Landmark 2:
Clean up the settlement file by collecting the summary data and converting the four columns into a combined column for comparison and matching to the account codes file.
Landmark 3:
Find the matching account code for each transaction and sum the total. So now we should have a summary amount for each account code.
Landmark 4:
Clean up the new combined dataframe into the proper template format for our accounting software. Export new file as .csv for uploading into accounting software.
#!/usr/bin/env python3.6
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
#Read in Amazon settlement - can be picked up by Python script later
az_data = pd.read_csv('/Users/XXX/Desktop/az_data.txt', sep='t', header=0, parse_dates=['settlement-start-date', 'settlement-end-date'])
df = pd.DataFrame(az_data)
#Read in Account codes - this can be SQL storage later
acct_codes = pd.read_csv('/Users/XXX/Desktop/acct_codes.csv', sep=',', header=0)
df_accts = pd.DataFrame(acct_codes)
#Take summary data from first row of Amazon settlement and use as check-data
settlement_id = df.iloc[0,0]
settlement_start_date = df.iloc[0,1]
settlement_end_date = df.iloc[0,2]
deposit_date = df.iloc[0,3]
invoice_total = df.iloc[0,4]
#Drop summary row as it is no longer needed and doesn't match
df.drop(df.index[0], inplace=True)
#Replace blank values in 'marketplace-name' column with 'alt-transaction' so groupby doesn't skip those values. And replace all other blank values with 'NA' value
fillvalues = 'marketplace-name': 'alt-transaction'
df.fillna(value=fillvalues, inplace=True)
df.fillna('NA', inplace=True)
#Create combined column to use as a key
df['combined-type'] = df['marketplace-name'] + df['transaction-type'] + df['amount-type']+ df['amount-description']
#Groupby combined column and take sum of categories
df_mod = df.groupby(['combined-type'])[['amount']].sum()
#Merge dataframes to get account codes and descriptions
df_results = df_mod.merge(df_accts, on='combined-type', how='left')
#Drop row from un-used account
df_results = df_results[df_results['combined-type'] != 'Non-AmazonOrderItemPricePrincipal']
#Rename columns to match Xero template
df_results.rename(columns='amount':'UnitAmount', inplace=True)
#Drop the now un-needed combined-type column
df_results.drop(columns=['combined-type'], inplace=True)
#Add invoice template columns with data
df_results['ContactName'] = 'Amazon.com'
df_results['InvoiceNumber'] = 'INV_' + str(settlement_id)
df_results['Reference'] = 'AZ_Xero_Py_' + str(settlement_id)
df_results['InvoiceDate'] = settlement_start_date
df_results['DueDate'] = settlement_end_date
df_results['Quantity'] = 1
df_results['Currency'] = 'USD'
df_results['TaxType'] = 'Tax on Sales'
df_results['TaxAmount'] = 0
df_results['TrackingName1'] = 'Channel'
df_results['TrackingOption1'] = 'Amazon'
#Re-order columns to match template
all_column_list = ['ContactName','EmailAddress','POAddressLine1','POAddressLine2','POAddressLine3','POAddressLine4','POCity','PORegion','POPostalCode','POCountry','InvoiceNumber','Reference','InvoiceDate','DueDate','InventoryItemCode','Description','Quantity','UnitAmount','Discount','AccountCode','TaxType','TrackingName1','TrackingOption1','TrackingName2','TrackingOption2','Currency','BrandingTheme']
df_final = df_results.reindex(columns=all_column_list)
print (df_final.dtypes)
#Re-format the datetimes to be dates
df_final['InvoiceDate'] = df_final['InvoiceDate'].dt.date
df_final['DueDate'] = df_final['DueDate'].dt.date
#Export final df to csv
df_final.to_csv('/Users/XXX/Desktop/invoice' + str(settlement_id) + '.csv', index=False)
python python-3.x csv pandas e-commerce
New contributor
$endgroup$
$begingroup$
Welcome to Code Review! I hope you'll get some great feedback.
$endgroup$
– Alex
3 hours ago
$begingroup$
Thanks @Alex , I am looking forward to learning!
$endgroup$
– tulanejosh
3 hours ago
add a comment |
$begingroup$
At the end of each billing cycle, Amazon generates a raw transaction file for my store's orders that cycle. I am converting that raw transaction file into a .csv file to be imported into my accounting software. My converter program takes two input files.
The first input, the Amazon data file, contains hundreds of lines, and looks like this:
settlement-id settlement-start-date settlement-end-date deposit-date total-amount currency transaction-type order-id merchant-order-id adjustment-id shipment-id marketplace-name amount-type amount-description amount fulfillment-id posted-date posted-date-time order-item-code merchant-order-item-id merchant-adjustment-item-id sku quantity-purchased promotion-id
11774871501 2019-04-01 13:09:26 UTC 2019-04-07 14:54:57 UTC 2019-04-09 14:54:57 UTC 11591.38 USD
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemPrice Principal 29.95 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemPrice Shipping 4.42 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees FBAPerUnitFulfillmentFee -3.19 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees Commission -4.49 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees ShippingChargeback -4.42 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemPrice Principal 173.08 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemFees FBAPerUnitFulfillmentFee -9.02 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemFees Commission -25.96 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
…
The second input is a CSV file that stores the necessary accounting codes, which contains a few dozen lines, and looks like this:
combined-type,AccountCode,Description
Non-AmazonOrderItemFeesFBAPerUnitFulfillmentFee,501,Non-Amazon - Order - ItemFees - FBAPerUnitFulfillmentFee
Amazon.comOrderItemPricePrincipal,400,Amazon.com - Order - ItemPrice - Principal
Amazon.comOrderItemPriceShipping,403,Amazon.com - Order - ItemPrice - Shipping
Amazon.comOrderItemPriceShippingTax,202,Amazon.com - Order - ItemPrice - ShippingTax
…
I used Python 3 and Pandas. Can you help me make this code bulletproof? My concern is the code does not catch errors and/or does not execute my intent.
Landmark 1:
Read in the settlement file (raw transaction file).
Read in the account codes file to match transactions to accounting codes.
Landmark 2:
Clean up the settlement file by collecting the summary data and converting the four columns into a combined column for comparison and matching to the account codes file.
Landmark 3:
Find the matching account code for each transaction and sum the total. So now we should have a summary amount for each account code.
Landmark 4:
Clean up the new combined dataframe into the proper template format for our accounting software. Export new file as .csv for uploading into accounting software.
#!/usr/bin/env python3.6
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
#Read in Amazon settlement - can be picked up by Python script later
az_data = pd.read_csv('/Users/XXX/Desktop/az_data.txt', sep='t', header=0, parse_dates=['settlement-start-date', 'settlement-end-date'])
df = pd.DataFrame(az_data)
#Read in Account codes - this can be SQL storage later
acct_codes = pd.read_csv('/Users/XXX/Desktop/acct_codes.csv', sep=',', header=0)
df_accts = pd.DataFrame(acct_codes)
#Take summary data from first row of Amazon settlement and use as check-data
settlement_id = df.iloc[0,0]
settlement_start_date = df.iloc[0,1]
settlement_end_date = df.iloc[0,2]
deposit_date = df.iloc[0,3]
invoice_total = df.iloc[0,4]
#Drop summary row as it is no longer needed and doesn't match
df.drop(df.index[0], inplace=True)
#Replace blank values in 'marketplace-name' column with 'alt-transaction' so groupby doesn't skip those values. And replace all other blank values with 'NA' value
fillvalues = 'marketplace-name': 'alt-transaction'
df.fillna(value=fillvalues, inplace=True)
df.fillna('NA', inplace=True)
#Create combined column to use as a key
df['combined-type'] = df['marketplace-name'] + df['transaction-type'] + df['amount-type']+ df['amount-description']
#Groupby combined column and take sum of categories
df_mod = df.groupby(['combined-type'])[['amount']].sum()
#Merge dataframes to get account codes and descriptions
df_results = df_mod.merge(df_accts, on='combined-type', how='left')
#Drop row from un-used account
df_results = df_results[df_results['combined-type'] != 'Non-AmazonOrderItemPricePrincipal']
#Rename columns to match Xero template
df_results.rename(columns='amount':'UnitAmount', inplace=True)
#Drop the now un-needed combined-type column
df_results.drop(columns=['combined-type'], inplace=True)
#Add invoice template columns with data
df_results['ContactName'] = 'Amazon.com'
df_results['InvoiceNumber'] = 'INV_' + str(settlement_id)
df_results['Reference'] = 'AZ_Xero_Py_' + str(settlement_id)
df_results['InvoiceDate'] = settlement_start_date
df_results['DueDate'] = settlement_end_date
df_results['Quantity'] = 1
df_results['Currency'] = 'USD'
df_results['TaxType'] = 'Tax on Sales'
df_results['TaxAmount'] = 0
df_results['TrackingName1'] = 'Channel'
df_results['TrackingOption1'] = 'Amazon'
#Re-order columns to match template
all_column_list = ['ContactName','EmailAddress','POAddressLine1','POAddressLine2','POAddressLine3','POAddressLine4','POCity','PORegion','POPostalCode','POCountry','InvoiceNumber','Reference','InvoiceDate','DueDate','InventoryItemCode','Description','Quantity','UnitAmount','Discount','AccountCode','TaxType','TrackingName1','TrackingOption1','TrackingName2','TrackingOption2','Currency','BrandingTheme']
df_final = df_results.reindex(columns=all_column_list)
print (df_final.dtypes)
#Re-format the datetimes to be dates
df_final['InvoiceDate'] = df_final['InvoiceDate'].dt.date
df_final['DueDate'] = df_final['DueDate'].dt.date
#Export final df to csv
df_final.to_csv('/Users/XXX/Desktop/invoice' + str(settlement_id) + '.csv', index=False)
python python-3.x csv pandas e-commerce
New contributor
$endgroup$
At the end of each billing cycle, Amazon generates a raw transaction file for my store's orders that cycle. I am converting that raw transaction file into a .csv file to be imported into my accounting software. My converter program takes two input files.
The first input, the Amazon data file, contains hundreds of lines, and looks like this:
settlement-id settlement-start-date settlement-end-date deposit-date total-amount currency transaction-type order-id merchant-order-id adjustment-id shipment-id marketplace-name amount-type amount-description amount fulfillment-id posted-date posted-date-time order-item-code merchant-order-item-id merchant-adjustment-item-id sku quantity-purchased promotion-id
11774871501 2019-04-01 13:09:26 UTC 2019-04-07 14:54:57 UTC 2019-04-09 14:54:57 UTC 11591.38 USD
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemPrice Principal 29.95 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemPrice Shipping 4.42 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees FBAPerUnitFulfillmentFee -3.19 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees Commission -4.49 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 111-3282062-5204245 111-3282062-5204245 DW6NY7djJ Amazon.com ItemFees ShippingChargeback -4.42 AFN 2019-04-03 2019-04-03 17:13:29 UTC 04346910081818 D0-FMT7-C3G9 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemPrice Principal 173.08 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemFees FBAPerUnitFulfillmentFee -9.02 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
11774871501 Order 114-8130626-1298654 114-8130626-1298654 D7RCVz0SP Amazon.com ItemFees Commission -25.96 AFN 2019-04-03 2019-04-03 22:32:57 UTC 50221749590266 E6-0OOH-4ASK 1
…
The second input is a CSV file that stores the necessary accounting codes, which contains a few dozen lines, and looks like this:
combined-type,AccountCode,Description
Non-AmazonOrderItemFeesFBAPerUnitFulfillmentFee,501,Non-Amazon - Order - ItemFees - FBAPerUnitFulfillmentFee
Amazon.comOrderItemPricePrincipal,400,Amazon.com - Order - ItemPrice - Principal
Amazon.comOrderItemPriceShipping,403,Amazon.com - Order - ItemPrice - Shipping
Amazon.comOrderItemPriceShippingTax,202,Amazon.com - Order - ItemPrice - ShippingTax
…
I used Python 3 and Pandas. Can you help me make this code bulletproof? My concern is the code does not catch errors and/or does not execute my intent.
Landmark 1:
Read in the settlement file (raw transaction file).
Read in the account codes file to match transactions to accounting codes.
Landmark 2:
Clean up the settlement file by collecting the summary data and converting the four columns into a combined column for comparison and matching to the account codes file.
Landmark 3:
Find the matching account code for each transaction and sum the total. So now we should have a summary amount for each account code.
Landmark 4:
Clean up the new combined dataframe into the proper template format for our accounting software. Export new file as .csv for uploading into accounting software.
#!/usr/bin/env python3.6
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
#Read in Amazon settlement - can be picked up by Python script later
az_data = pd.read_csv('/Users/XXX/Desktop/az_data.txt', sep='t', header=0, parse_dates=['settlement-start-date', 'settlement-end-date'])
df = pd.DataFrame(az_data)
#Read in Account codes - this can be SQL storage later
acct_codes = pd.read_csv('/Users/XXX/Desktop/acct_codes.csv', sep=',', header=0)
df_accts = pd.DataFrame(acct_codes)
#Take summary data from first row of Amazon settlement and use as check-data
settlement_id = df.iloc[0,0]
settlement_start_date = df.iloc[0,1]
settlement_end_date = df.iloc[0,2]
deposit_date = df.iloc[0,3]
invoice_total = df.iloc[0,4]
#Drop summary row as it is no longer needed and doesn't match
df.drop(df.index[0], inplace=True)
#Replace blank values in 'marketplace-name' column with 'alt-transaction' so groupby doesn't skip those values. And replace all other blank values with 'NA' value
fillvalues = 'marketplace-name': 'alt-transaction'
df.fillna(value=fillvalues, inplace=True)
df.fillna('NA', inplace=True)
#Create combined column to use as a key
df['combined-type'] = df['marketplace-name'] + df['transaction-type'] + df['amount-type']+ df['amount-description']
#Groupby combined column and take sum of categories
df_mod = df.groupby(['combined-type'])[['amount']].sum()
#Merge dataframes to get account codes and descriptions
df_results = df_mod.merge(df_accts, on='combined-type', how='left')
#Drop row from un-used account
df_results = df_results[df_results['combined-type'] != 'Non-AmazonOrderItemPricePrincipal']
#Rename columns to match Xero template
df_results.rename(columns='amount':'UnitAmount', inplace=True)
#Drop the now un-needed combined-type column
df_results.drop(columns=['combined-type'], inplace=True)
#Add invoice template columns with data
df_results['ContactName'] = 'Amazon.com'
df_results['InvoiceNumber'] = 'INV_' + str(settlement_id)
df_results['Reference'] = 'AZ_Xero_Py_' + str(settlement_id)
df_results['InvoiceDate'] = settlement_start_date
df_results['DueDate'] = settlement_end_date
df_results['Quantity'] = 1
df_results['Currency'] = 'USD'
df_results['TaxType'] = 'Tax on Sales'
df_results['TaxAmount'] = 0
df_results['TrackingName1'] = 'Channel'
df_results['TrackingOption1'] = 'Amazon'
#Re-order columns to match template
all_column_list = ['ContactName','EmailAddress','POAddressLine1','POAddressLine2','POAddressLine3','POAddressLine4','POCity','PORegion','POPostalCode','POCountry','InvoiceNumber','Reference','InvoiceDate','DueDate','InventoryItemCode','Description','Quantity','UnitAmount','Discount','AccountCode','TaxType','TrackingName1','TrackingOption1','TrackingName2','TrackingOption2','Currency','BrandingTheme']
df_final = df_results.reindex(columns=all_column_list)
print (df_final.dtypes)
#Re-format the datetimes to be dates
df_final['InvoiceDate'] = df_final['InvoiceDate'].dt.date
df_final['DueDate'] = df_final['DueDate'].dt.date
#Export final df to csv
df_final.to_csv('/Users/XXX/Desktop/invoice' + str(settlement_id) + '.csv', index=False)
python python-3.x csv pandas e-commerce
python python-3.x csv pandas e-commerce
New contributor
New contributor
edited 2 hours ago
200_success
131k17157422
131k17157422
New contributor
asked 3 hours ago
tulanejoshtulanejosh
162
162
New contributor
New contributor
$begingroup$
Welcome to Code Review! I hope you'll get some great feedback.
$endgroup$
– Alex
3 hours ago
$begingroup$
Thanks @Alex , I am looking forward to learning!
$endgroup$
– tulanejosh
3 hours ago
add a comment |
$begingroup$
Welcome to Code Review! I hope you'll get some great feedback.
$endgroup$
– Alex
3 hours ago
$begingroup$
Thanks @Alex , I am looking forward to learning!
$endgroup$
– tulanejosh
3 hours ago
$begingroup$
Welcome to Code Review! I hope you'll get some great feedback.
$endgroup$
– Alex
3 hours ago
$begingroup$
Welcome to Code Review! I hope you'll get some great feedback.
$endgroup$
– Alex
3 hours ago
$begingroup$
Thanks @Alex , I am looking forward to learning!
$endgroup$
– tulanejosh
3 hours ago
$begingroup$
Thanks @Alex , I am looking forward to learning!
$endgroup$
– tulanejosh
3 hours ago
add a comment |
0
active
oldest
votes
Your Answer
StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
tulanejosh is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e)
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom))
StackExchange.using('gps', function() StackExchange.gps.track('embedded_signup_form.view', location: 'question_page' ); );
$window.unbind('scroll', onScroll);
;
$window.on('scroll', onScroll);
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f217287%2fconvert-amazon-transaction-data-for-import-into-accounting-system%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
0
active
oldest
votes
0
active
oldest
votes
active
oldest
votes
active
oldest
votes
tulanejosh is a new contributor. Be nice, and check out our Code of Conduct.
tulanejosh is a new contributor. Be nice, and check out our Code of Conduct.
tulanejosh is a new contributor. Be nice, and check out our Code of Conduct.
tulanejosh is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e)
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom))
StackExchange.using('gps', function() StackExchange.gps.track('embedded_signup_form.view', location: 'question_page' ); );
$window.unbind('scroll', onScroll);
;
$window.on('scroll', onScroll);
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f217287%2fconvert-amazon-transaction-data-for-import-into-accounting-system%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e)
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom))
StackExchange.using('gps', function() StackExchange.gps.track('embedded_signup_form.view', location: 'question_page' ); );
$window.unbind('scroll', onScroll);
;
$window.on('scroll', onScroll);
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e)
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom))
StackExchange.using('gps', function() StackExchange.gps.track('embedded_signup_form.view', location: 'question_page' ); );
$window.unbind('scroll', onScroll);
;
$window.on('scroll', onScroll);
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e)
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom))
StackExchange.using('gps', function() StackExchange.gps.track('embedded_signup_form.view', location: 'question_page' ); );
$window.unbind('scroll', onScroll);
;
$window.on('scroll', onScroll);
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
Welcome to Code Review! I hope you'll get some great feedback.
$endgroup$
– Alex
3 hours ago
$begingroup$
Thanks @Alex , I am looking forward to learning!
$endgroup$
– tulanejosh
3 hours ago