I started tracking my expenses on my phone, and the app I was using could export them in CSV format to my Dropbox. The idea was to create a dashboard using Power BI and Tableau so that I could visualize my spending better. To connect the dashboards to the data, I decided to download the files from Dropbox to local storage. This was a manual process, which I didn’t like, so I automated that part and ended up writing this Python script.
Please note that there is a lot you can do with the Dropbox API, but I am focusing only on my use case here. I may add more functionality later. For now, the script needs to:
- Download all the files from the given dropbox folder to a folder in my local storage.
- Before copying all the files, quickly check for the existing files in the local folder and only download the new files.
Creating a Dropbox App
Since we will be using the official Dropbox API, let’s create an app for our purpose. To create an app, you can simply go to this website, go to App Console on the top right and create an app.
A few things to note while creating the app:
- Make sure you create an app with access to your full Dropbox, not just a specific folder. This will help if you want to add more functionality to your application later.
- Under the Permissions tab, make sure to check the appropriate permissions. In our case, since we need to download files from Dropbox, I have given my app the “files.content.read” permission.
- If you already generated a token before updating the permissions, you will need to generate a new token.
- Once you are okay with all the settings then generate the token. This will help you with the authentication.
Storing your token in a secure place
The token is used within the Python script and should not be shared with others, so we need a way to keep the token secret while still being able to share the script. There are multiple ways to do that; I opted for a config file. The idea is simple: create a “config.ini” file that stores the token, then use Python’s “configparser” module to read that file.
You can store your token within the config file like this:
[LOGIN]
access_token = YourAccessToken
We can then read our token using Python:
# Importing Library
import configparser
# Reading the access token
config = configparser.ConfigParser()
config.read('config.ini')
access_token = config['LOGIN']['access_token']
# Output
>>> print(access_token)
YourAccessToken
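One thing to watch for: `config.read` silently skips files it cannot find, so a missing “config.ini” only shows up later as a `KeyError`. Below is a minimal sketch of a stricter loader. The `load_access_token` helper name is my own for illustration and is not part of the script in this article.

```python
import configparser


def load_access_token(config_path='config.ini'):
    """Return the access token from the [LOGIN] section of an INI file.

    Hypothetical helper: raises a descriptive error instead of a bare KeyError.
    """
    config = configparser.ConfigParser()
    # read() returns the list of files it parsed successfully;
    # a missing file is skipped silently, so check the result
    if not config.read(config_path):
        raise FileNotFoundError(f'Could not read config file: {config_path}')
    # fallback=None avoids an exception when the section or option is absent
    token = config.get('LOGIN', 'access_token', fallback=None)
    if not token:
        raise ValueError(f'No access_token under [LOGIN] in {config_path}')
    return token
```

If the script lives in a Git repository, adding “config.ini” to `.gitignore` keeps the token out of version control.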
Let’s initialize our Dropbox API
Let’s first connect to our Dropbox account using the Dropbox API and the access token.
# Importing Library
import dropbox
from dropbox.exceptions import AuthError
# Connecting to dropbox
try:
    dbx = dropbox.Dropbox(access_token)
    # Creating the client makes no request, so call the API once to validate the token
    dbx.users_get_current_account()
except AuthError as e:
    print(f'Error connecting to Dropbox using the access token: {e}')
- We connect to Dropbox using the access token.
- We request the current account once so that an invalid token fails here, and we handle the resulting authentication error.
Let’s now get metadata for all the files inside the desired folder
""" Getting metadata of all the files inside the desired folder """
dropbox_folder_path = '/Apps/SpendingTracker/Exports'
files = dbx.files_list_folder(dropbox_folder_path).entries
files_list = []
for file in files:
    if isinstance(file, dropbox.files.FileMetadata):
        metadata = {
            'name': file.name,
            'path_display': file.path_display,
        }
        files_list.append(metadata)
- Here, we use the “files_list_folder” method to get all metadata about the files in a specific folder.
- We then store our metadata in a list of dictionaries.
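`files_list_folder` returns results one page at a time, so a folder with many exports may not fit in a single response. The sketch below follows the `has_more` flag and calls `files_list_folder_continue` with the returned cursor until every page has been collected. The `list_all_entries` helper name is an assumption of mine; `dbx` is the client created earlier.

```python
def list_all_entries(dbx, folder_path):
    """Collect entries from every page of a Dropbox folder listing.

    Hypothetical helper: `dbx` is expected to be a connected dropbox.Dropbox client.
    """
    result = dbx.files_list_folder(folder_path)
    entries = list(result.entries)
    # has_more signals that another page is available via result.cursor
    while result.has_more:
        result = dbx.files_list_folder_continue(result.cursor)
        entries.extend(result.entries)
    return entries
```

The returned list can be used in place of the `.entries` of the first page in the loop above.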
Let’s save our metadata into a pandas DataFrame
""" Saving the metadata into a Pandas DataFrame """
# Importing Library
import pandas as pd
df = pd.DataFrame.from_records(files_list)
- Here, I simply save our data in a DataFrame.
- Our DataFrame has two columns: name and path_display, which can be used to download the files.
Let’s now download our files using the DataFrame
One thing to keep in mind is to check all the files which are already downloaded, so that we only download the new files.
# Importing Library
from os import listdir
# Checking all the downloaded files
local_folder_path = './spend data'
local_files = listdir(local_folder_path)
# Downloading files from Dropbox
for name, path in zip(df.loc[:, 'name'], df.loc[:, 'path_display']):
    if name not in local_files:
        metadata, result = dbx.files_download(path)
        with open(f'{local_folder_path}/{name}', 'wb') as f:
            f.write(result.content)
- We first get a list of all the files which are present in our local folder.
- Using that list, we only download the files which are not present in the local folder using the “files_download” method.
- I then write the file and save it in the local folder.
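The “only download what is new” check is a comparison between the remote and local file names. Here is a small sketch of that step on its own, with no Dropbox calls, so it can be tried against any folder. The `find_new_files` name is hypothetical, and creating the folder when it is missing is an addition of mine; the code above assumes the folder already exists.

```python
import os


def find_new_files(remote_names, local_folder_path):
    """Return the remote file names that are not yet in the local folder."""
    # Create the folder on first run so listdir does not raise FileNotFoundError
    os.makedirs(local_folder_path, exist_ok=True)
    # A set gives constant-time membership checks instead of scanning a list
    local_names = set(os.listdir(local_folder_path))
    return [name for name in remote_names if name not in local_names]
```

Note that the comparison is by name only, so a file that is re-exported under the same name with new content would be skipped.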
Let’s combine everything
Now we have all the pieces ready. Let’s combine them into a Python script that we can simply execute to download the new files from our Dropbox folder.
Please note, we can use command line arguments to customize our script. But for now, I am not going to do that since I am targeting a specific problem which I want to solve using automation.
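For reference, this is roughly what the command-line version could look like with `argparse` from the standard library. The flag names `--dropbox-folder` and `--local-folder` are made up for this sketch; the script below keeps the paths as constants.

```python
import argparse


def build_parser():
    """Build a parser for the two folder paths (flag names are illustrative)."""
    parser = argparse.ArgumentParser(
        description='Download new files from a Dropbox folder.')
    parser.add_argument('--dropbox-folder', default='/Apps/SpendingTracker/Exports',
                        help='Dropbox folder to download from')
    parser.add_argument('--local-folder', default='./spend data',
                        help='Local folder to save files into')
    return parser


# Passing a list instead of reading sys.argv makes the example easy to try
args = build_parser().parse_args(['--local-folder', './downloads'])
print(args.dropbox_folder)
print(args.local_folder)
```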
# Importing Library
import dropbox
from os import listdir
import pandas as pd
import configparser
from dropbox.exceptions import AuthError

# Variable
DROPBOX_FOLDER_PATH = '/Apps/SpendingTracker/Exports'
LOCAL_FOLDER_PATH = './spend data'

# Getting all local files
local_files = listdir(LOCAL_FOLDER_PATH)

# Getting the access token
config = configparser.ConfigParser()
config.read('config.ini')
access_token = config['LOGIN']['access_token']

# Initializing Dropbox API
def dropbox_connect():
    """Create a connection to Dropbox."""
    try:
        dbx = dropbox.Dropbox(access_token)
        # Creating the client makes no request, so call the API once to validate the token
        dbx.users_get_current_account()
    except AuthError as e:
        print('Error connecting to Dropbox with access token: ' + str(e))
        # Re-raise: without a valid connection there is nothing to return
        raise
    return dbx

def dropbox_list_files():
    """Return the Dropbox client and a Pandas dataframe of the files in DROPBOX_FOLDER_PATH."""
    dbx = dropbox_connect()
    try:
        files = dbx.files_list_folder(DROPBOX_FOLDER_PATH).entries
        files_list = []
        for file in files:
            # Keep files only; folders and deleted entries are skipped
            if isinstance(file, dropbox.files.FileMetadata):
                metadata = {
                    'name': file.name,
                    'path_display': file.path_display,
                    'client_modified': file.client_modified,
                    'server_modified': file.server_modified
                }
                files_list.append(metadata)
        df = pd.DataFrame.from_records(files_list)
        return dbx, df.sort_values(by='server_modified', ascending=False)
    except Exception as e:
        print('Error getting list of files from Dropbox: ' + str(e))
        # Re-raise so the caller does not try to unpack a None return value
        raise

def dropbox_download_and_save_files():
    """ Downloads and saves the files which don't already exist locally """
    dbx, df = dropbox_list_files()
    for name, path in zip(df.loc[:, 'name'], df.loc[:, 'path_display']):
        if name not in local_files:
            metadata, result = dbx.files_download(path)
            with open(f'{LOCAL_FOLDER_PATH}/{name}', 'wb') as f:
                f.write(result.content)

if __name__ == '__main__':
    dropbox_download_and_save_files()
You can find the link to the above code on my Github repository as well (link to code).
Hopefully, this article works as a jumping-off point for any Dropbox API project you want to build. It also serves as a simple example of how you can use Python to automate your work. If you have other ideas for automation or any other project, feel free to reach out to me at preetparmar@outlook.com.
At the time of writing this article, I ran into an issue where the Dropbox token expires every 4 hours. I found a related article here. The summary is that Dropbox is moving to new “short-lived” access tokens, but you can generate a token that doesn’t expire by updating the “Access token expiration” setting on the settings tab of the app’s info page. Unfortunately, I don’t see that option on my settings page.
UPDATE: Updating my login code based on the new authentication method provided here.
# Importing Library
import dropbox
from dropbox import DropboxOAuth2FlowNoRedirect
# Create a connection to Dropbox
# key and secret hold the App Key and App Secret read from the config file
auth_flow = DropboxOAuth2FlowNoRedirect(key, secret)
authorize_url = auth_flow.start()
print("1. Go to: " + authorize_url)
print("2. Click \"Allow\" (you might have to log in first).")
print("3. Copy the authorization code.")
auth_code = input("Enter the authorization code here: ").strip()
try:
    oauth_result = auth_flow.finish(auth_code)
    dbx = dropbox.Dropbox(oauth2_access_token=oauth_result.access_token)
except Exception as e:
    print('Error: %s' % (e,))
    exit(1)
- Key and Secret parameters refer to the App Key and App Secret respectively. I saved those in my config file, and similar to the access token you can get those from the config file itself.
I have updated my GitHub repository to reflect this change.