Planned re-launch of comtradr · Issue #45 · ropensci/comtradr

Hi all!

The Comtrade API has been undergoing extensive changes. To accommodate these changes, comtradr will undergo a relaunch. Given that the most important functions are not working at the moment, this is a sensible moment to deprecate some of them and to update some of the packages we depend on, e.g. moving from httr to httr2. Sadly, Chris will also step down as maintainer of the package, so a transition of maintainership will happen in parallel.

To get an overview of the needed changes, I will try to summarize them here and put out some ideas, which I would then propose to discuss in separate issues so we can distribute the work efficiently. Please let me know if I forgot something important (surely I did...).

Pinging everyone who wanted to help or is already a co-maintainer: @hgoers @dkolkin @pachadotdev

Authentication

Currently, the ct_register_token() function registers the token and does some extensive validation, depending on whether there is a token and whether it is a premium one. The new API always requires a token. I propose that, for now, we focus only on the options of the free API, as I personally do not have access to the premium API and it is quite expensive. I assume most users will not have access anyway. If somebody has access to the premium features, we could extend the package for that option later?

I think we can probably just use two simple functions built on the httr2 package for setting and getting the token, which are relatively user-friendly, e.g. like here: https://github.com/datapumpernickel/untrader/blob/main/R/utils.R

If we have time, we can think about secrets and safer ways to store the credentials, but I think this suffices for now (a rough sketch of such helpers is below). As far as I can tell, there is no way to get nicely formatted data on how many of the 250 free daily calls are left; we could think about implementing something for that, but I would keep it for a future extension as well.
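For illustration, here is a minimal sketch of what such set/get helpers could look like, assuming the token simply lives in an environment variable; the function names and the COMTRADE_PRIMARY_KEY variable are placeholders, not a final design:

```r
# Minimal sketch: store and retrieve the Comtrade API key via an environment
# variable. All names are placeholders.

ct_set_key <- function(key) {
  stopifnot(is.character(key), length(key) == 1, nzchar(key))
  Sys.setenv(COMTRADE_PRIMARY_KEY = key)
  invisible(TRUE)
}

ct_get_key <- function() {
  key <- Sys.getenv("COMTRADE_PRIMARY_KEY", unset = "")
  if (!nzchar(key)) {
    stop("No Comtrade API key found. Set one with ct_set_key().", call. = FALSE)
  }
  key
}
```

If we later want something safer than a plain environment variable, httr2 also ships helpers such as secret_encrypt()/secret_decrypt() that we could look into.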

Getting data

Currently, most things happen in the ct_search() function, which both checks the input parameters and builds the request. The execution of the API request is already factored out into another function, but that function also does some of the processing of the response. My proposal would be to modularize this further, to make it easier to troubleshoot, easier to extend specific parts (e.g. adding new parameters), and possible to work on in parallel without conflicts.

The structure I started in another package for testing this out is the following:

ct_get_data()
│
├── ct_params_check()
│   │   This function calls all subsequent check functions in the relevant hierarchy
│   │   and passes arguments between them (e.g. it checks the classification code and
│   │   URL parameters before the commodity code and passes them on to ct_cmd_code_check()).
│   │
│   ├── ct_class_code_check()
│   │     The following functions check the input parameters for validity and translate
│   │     them from human-readable values to the machine-readable codes of the Comtrade API.
│   ├── ct_freq_check()
│   ├── ct_cmd_code_check()
│   ├── ct_flow_direction_check()
│   ├── ct_reporter_check()
│   ├── ct_partner_check()
│   └── > extendable for all parameters that we want to include or that are added in the future
│
├── ct_request_build()
│     Builds the URL with the httr2 library.
│
├── ct_request_perform()
│     Performs the request, catches and returns possible HTTP errors; retries and
│     throttling are easy to implement with httr2.
│
└── ct_request_process()
      Parses the response into an R data.frame; could also include a
      'pretty_cols = TRUE' argument to allow for further processing.

You can see this partially implemented in this example: https://github.com/datapumpernickel/untrader/blob/main/R/check_params.R
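To make the intended flow concrete, here is a rough sketch of how the pieces could compose with httr2. Everything in it is illustrative: the endpoint path, parameter names, header name and response shape are assumptions, not the final design.

```r
library(httr2)

# Sketch only: the endpoint path, parameter names, header name and return
# shapes are assumptions, not the final design.
ct_request_build <- function(params, token) {
  request("https://comtradeapi.un.org/data/v1/get") |>
    req_url_path_append(params$type_code, params$freq_code, params$class_code) |>
    req_url_query(
      reporterCode = params$reporter,
      partnerCode  = params$partner,
      cmdCode      = params$cmd_code,
      flowCode     = params$flow
    ) |>
    req_headers("Ocp-Apim-Subscription-Key" = token)
}

ct_request_perform <- function(req) {
  req |>
    req_retry(max_tries = 3) |>   # retry transient failures
    req_throttle(rate = 1) |>     # stay well below the rate limit
    req_perform()
}

ct_request_process <- function(resp) {
  body <- resp_body_json(resp, simplifyVector = TRUE)
  # Assumption: the records sit in a `data` element of the response body.
  as.data.frame(body$data)
}

ct_get_data <- function(..., token = ct_get_key()) {
  params <- ct_params_check(...)  # validated and translated parameters
  ct_request_build(params, token) |>
    ct_request_perform() |>
    ct_request_process()
}
```

The nice part of this split is that retries and throttling come almost for free from httr2, and each step can be tested in isolation.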

I think this would make it very easy for us to extend our coverage of the API in the future and, at the same time, make it easy to pinpoint errors to specific parameters. Lastly, it would probably also make it easier to write testthat tests for these functions. I am in no way set on any of the function names; they are just examples.

The previous, now unused functions we should keep in one file (e.g. legacy.R) and give them "deprecated" error messages that help users transition to the new functions (a small sketch is below). See #44 for some hints and ideas by Maëlle on this.
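As a sketch of what such a stub could look like, assuming we keep the old ct_search() name around and point users to a new ct_get_data(); the lifecycle package is one option, base .Defunct() would be another:

```r
# legacy.R -- sketch of a deprecation stub (names and version are placeholders)
ct_search <- function(...) {
  lifecycle::deprecate_stop(
    when = "1.0.0",
    what = "ct_search()",
    with = "ct_get_data()",
    details = "The Comtrade API changed fundamentally; see the package news for details."
  )
}
```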

Package data: getting and saving the data

When verifying the input parameters we will need to access the predefined reference tables of the UN. There are a few functions currently implemented for saving and querying these; I guess we should see how much of them we can salvage and what would benefit from a re-write.
The full list of reference tables can be found here: https://comtradeapi.un.org/files/v1/app/reference/ListofReferences.json
It would probably make sense to write a function that queries all of these in some smart fashion and saves them in extdata (a sketch of the fetching part is below). I think it would be good to make these tables available to the user as well and not keep them internal; at least, I found myself frequently looking at them. I would need to familiarize myself with the best practices around sysdata/extdata and the other directories.
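A minimal sketch of the fetching part, assuming the JSON at the URL above lists one entry per reference table with a field containing the file URL; the field names used here (results, fileuri, category) are assumptions and would need to be checked against the actual response:

```r
library(httr2)

# Sketch only: the field names (`results`, `fileuri`, `category`) are
# assumptions and need to be checked against ListofReferences.json.
ct_download_reference_tables <- function(dest_dir = "inst/extdata") {
  refs <- request("https://comtradeapi.un.org/files/v1/app/reference/ListofReferences.json") |>
    req_perform() |>
    resp_body_json(simplifyVector = TRUE)

  ref_list <- refs$results  # assumption about the top-level element

  for (i in seq_len(nrow(ref_list))) {
    tbl <- request(ref_list$fileuri[i]) |>
      req_perform() |>
      resp_body_json(simplifyVector = TRUE)
    saveRDS(tbl, file.path(dest_dir, paste0(ref_list$category[i], ".rds")))
  }
  invisible(dest_dir)
}
```

Whether extdata, sysdata or a cached download at run time (see the next point) is the right place is exactly the best-practices question mentioned above.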

Updating the reference tables on the user side

Currently there is a package function, ct_update_databases(), that updates the tables for the current session when users execute it. I am not sure what the specific guidelines of rOpenSci and CRAN are for saving data like this temporarily and permanently; maybe it would make sense to do some research into how this is best solved and then adjust to that. I assume Chris put some thought into this and we might be able to recycle large parts of this function.
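One CRAN-compliant pattern (just a suggestion, not necessarily what ct_update_databases() does today) is to cache the downloaded tables in the per-user cache directory that base R provides:

```r
# Sketch: cache a downloaded reference table in the user's cache directory,
# which is where CRAN policy allows persistent files (requires R >= 4.0).
ct_cache_dir <- function() {
  dir <- tools::R_user_dir("comtradr", which = "cache")
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  dir
}

ct_read_or_update_table <- function(name, update = FALSE) {
  path <- file.path(ct_cache_dir(), paste0(name, ".rds"))
  if (update || !file.exists(path)) {
    tbl <- ct_download_reference_table(name)  # hypothetical downloader
    saveRDS(tbl, path)
  }
  readRDS(path)
}
```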

Look-up functions

The package currently provides two look-up functions, ct_country_lookup() and ct_commodity_lookup(), to make it easier to query the API in natural language (a usage sketch follows the list below):

  1. For country codes:

  2. For commodity codes:
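For reference, usage of the current look-up functions looks roughly like this; the examples and signatures are from the pre-relaunch package and may well change with the relaunch:

```r
# Based on the existing (pre-relaunch) comtradr functions; exact signatures
# and return values may change.
library(comtradr)

ct_country_lookup("korea", type = "reporter")
#> returns the matching country names used by Comtrade

ct_commodity_lookup("halibut")
#> returns the commodity codes and descriptions matching "halibut"
```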

Testing

We will have to write new tests for our new suite of functions. After reading some of the reviews of the package from last time, I suggest we write many small testthat statements, each with few arguments, to easily pinpoint what is wrong when a test fails.
Furthermore, Maëlle suggested we implement testing for HTTP calls (see #44). I think this is a great idea, because it would allow us to check that the HTTP request is built properly without passing one of our API tokens to GitHub, and without figuring out how to make the checks pass for a CRAN submission that has no API token available.
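A sketch of the kind of small, focused tests I have in mind; the second block assumes we use httptest2 for recorded/mocked HTTP fixtures (vcr would be an alternative), and all function names and arguments are the placeholders from above:

```r
library(testthat)

test_that("ct_freq_check() rejects unknown frequencies", {
  expect_error(ct_freq_check("weekly"), "frequency")
  expect_equal(ct_freq_check("annual"), "A")  # assumed translation
})

# With httptest2, requests are replayed from fixtures instead of hitting the
# live API, so no token has to live on GitHub or CRAN.
httptest2::with_mock_dir("ct_get_data", {
  test_that("ct_get_data() returns a data.frame", {
    res <- ct_get_data(reporter = "DEU", partner = "FRA", cmd_code = "TOTAL")
    expect_s3_class(res, "data.frame")
  })
})
```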

Badges

There are quite a few badges that are currently not working anymore or are failing. I honestly have no clue about most of these.
My naive suggestion would be to archive them for now, together with their respective YAML files, and to implement them again once we have gotten further with the new functions.

@ChrisMuir It would be great to get some guidance from you on this. Maybe we can set up a Slack group with the new co-maintainers to discuss this internally.

Travis: I get a "Project not found or access denied" error when following the link: https://ci.appveyor.com/project/ropensci/comtradr

Codecov: same here; something seems to have been archived and I am not sure what would be necessary to get it running again: https://app.codecov.io/gh/ropensci/comtradr

CRAN: obviously failing at the moment, since we are off CRAN for now.

Peer-reviewed: although the package has been peer-reviewed, after this comprehensive re-launch it probably should not claim the badge until a re-review. Is there any policy on this, @maelle?

Really looking forward to working with all of you on this!

