************************************************
* Making IPUMS USA extracts via API from Stata *
* v 0.1                                        *
* Renae Rodgers                                *
************************************************

*************************************************************************************************************************
* Before running this do file, you must
* 1. be running Stata 16 or higher.
*
* 2. have a conda environment set up with the ipumspy library installed (v0.2.1 or higher).
*    It is highly recommended to set up a separate conda environment for using ipumspy through Stata rather than
*    installing ipumspy in the root conda environment.
*
* 3. have an IPUMS account and a microdata extract API key.
*************************************************************************************************************************

* Set the python executable file to be the one in your ipumspy-containing conda environment *
* This is the full path to the python executable inside the conda environment.
* UPDATE THIS LINE WITH THE PATH TO YOUR PYTHON EXECUTABLE
set python_exec C:\Users\rodge103\Miniconda3\envs\ipums3812\python.exe

* check to make sure that worked *
python query

* save your API key in a macro
global MY_API_KEY "YOUR API KEY HERE"

****************************************************************************************
* PRO TIP! You can set your API key macro and python executable in a profile.do file
* to avoid accidentally exposing credentials and to use the same python executable
* for all activity within a Stata session!
****************************************************************************************

* Now we're going to drop into python *
python:
# import necessary libraries
import gzip
import shutil

from ipumspy import IpumsApiClient, UsaExtract
from sfi import Macro

# retrieve live api key from the global macro set above (or in the user's profile.do file)
# this is the best substitute I have come up with for a conda environment variable
# in a Stata context
my_api_key = Macro.getGlobal("MY_API_KEY")

ipums = IpumsApiClient(my_api_key)

# Define
# 1. An IPUMS data collection
# 2. A list of sample IDs
# 3. A list of variables
# 4. An extract description
ipums_collection = "usa"
samples = ["us2012a"]
variables = ["AGE", "SEX", "RELATE"]
extract_description = "My first API extract!"

# use all of this info to create a UsaExtract object
extract = UsaExtract(samples, variables, description=extract_description)

# submit your extract to the IPUMS extract system
ipums.submit_extract(extract)

# wait for the extract to finish
ipums.wait_for_extract(extract, collection=ipums_collection)

# when the extract is finished, download it to your current working directory
ipums.download_extract(extract, stata_command_file=True)

# store the extract project and extract id as local macros
# for later use when reading in data
Macro.setLocal("id", str(extract.extract_id).zfill(5))
Macro.setLocal("collection", extract.collection)

# unzip the extract data file
with gzip.open(f"{ipums_collection}_{str(extract.extract_id).zfill(5)}.dat.gz", 'rb') as f_in:
    with open(f"{ipums_collection}_{str(extract.extract_id).zfill(5)}.dat", 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)

# exit python
end

* clear the python info *
python clear

* now we should see a data file and a ddi file in our current working directory
ls

* WORKED!
* now we can run the do file and read the ipums extract into Stata!
qui do `collection'_`id'.do

* check out the data!
list age sex relate in 1/10

******************************
* Your analysis starts here! *
******************************
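
/*
Optional sketch (not part of the original workflow): the unzip step above builds
the zero-padded file name twice. The helper below is one way to build it once.
gunzip_extract and its directory argument are hypothetical names introduced for
illustration; they are not part of ipumspy. It assumes the downloaded file follows
the <collection>_<five-digit id>.dat.gz pattern used earlier in this do file.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_extract(collection, extract_id, directory="."):
    """Decompress <collection>_<zero-padded id>.dat.gz into a .dat file beside it.

    Hypothetical helper for illustration; not part of ipumspy.
    """
    # build the shared file stem once, e.g. "usa_00007"
    stem = f"{collection}_{str(extract_id).zfill(5)}"
    gz_path = Path(directory) / f"{stem}.dat.gz"
    dat_path = Path(directory) / f"{stem}.dat"

    # stream the compressed bytes into the uncompressed file
    with gzip.open(gz_path, "rb") as f_in, open(dat_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)

    return dat_path
```

Inside the python: block above, the nested with statements could then be replaced
by a call such as gunzip_extract(ipums_collection, extract.extract_id).
*/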