Server to convert Profiles RNS VIVO JSON-LD into a simplified JSON representation
In addition to being a lot easier for developers to work with, the interface also features strong caching and failover support, in order to maximize performance for end users.
Quick summary
- This service converts Profiles (VIVO/RNS/ORNG) JSON-LD into a simplified JSON representation for a single person/profile. It accepts multiple identifier types and returns a compact JSON object suitable for client consumption.
Endpoint
- The PSGI app is defined in app.psgi. It accepts GET requests with one identifier parameter (one of the keys listed below). Responses are JSON (application/json), and the app supports JSONP via the callback parameter.
Identifier query parameters (as accepted by app.psgi)
- FNO: FNO / FNO-like identifiers (example: anirvan.chatterjee@ucsf.edu)
- Person: Profiles internal Person ID (numeric)
- EmployeeID: Employee ID
- EPPN: eduPersonPrincipalName (will be mapped to UserName)
- ProfilesURLName: Pretty URL name (e.g. anirvan.chatterjee)
- ProfilesNodeID: Numeric Profiles node id; returns canonical profile URL directly
- URL: A full Profiles URL (canonical / pretty / historical ProfileDetails.aspx)
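As a minimal client-side sketch of how one of the identifiers above could be turned into a request URL: the helper name `build_request_url` and the base URL are assumptions for illustration, not part of the service.

```python
from urllib.parse import urlencode

# Identifier keys listed above; a request carries exactly one of them.
IDENTIFIER_KEYS = {
    "FNO", "Person", "EmployeeID", "EPPN",
    "ProfilesURLName", "ProfilesNodeID", "URL",
}


def build_request_url(base_url, identifier_key, identifier_value, source, **extra):
    """Return a GET URL for the service.

    base_url is an assumption (for example a local plackup instance).
    extra carries optional parameters such as cache="never" or callback="mycb".
    """
    if identifier_key not in IDENTIFIER_KEYS:
        raise ValueError(f"unknown identifier type: {identifier_key}")
    params = {identifier_key: identifier_value, "source": source}
    params.update(extra)
    # urlencode percent-escapes reserved characters such as "@" in an FNO.
    return f"{base_url}/?{urlencode(params)}"


url = build_request_url(
    "http://localhost:5000", "ProfilesURLName", "anirvan.chatterjee", "myapp"
)
```

Rejecting unknown keys on the client side gives an immediate error instead of a round trip to the server.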
Examples
- PrettyURL (JSON):
  curl "http://localhost:5000/?ProfilesURLName=anirvan.chatterjee&source=myapp"
- FNO (JSON, force no cache):
  curl "http://localhost:5000/?FNO=anirvan.chatterjee@ucsf.edu&cache=never&source=myapp"
- ProfilesNodeID (JSONP):
  curl "http://localhost:5000/?ProfilesNodeID=370974&callback=mycb&source=myapp"
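The JSONP example returns the JSON wrapped in a call to the named callback. A non-browser client that receives such a body could unwrap it roughly as follows; this is a sketch, and the function name `unwrap_jsonp` is hypothetical.

```python
import json
import re


def unwrap_jsonp(body, callback):
    """Strip a callback(...) wrapper and parse the JSON inside it."""
    pattern = re.compile(
        r"^\s*" + re.escape(callback) + r"\s*\((.*)\)\s*;?\s*$",
        re.DOTALL,  # the JSON payload may span several lines
    )
    match = pattern.match(body)
    if match is None:
        raise ValueError("body is not wrapped in the expected callback")
    return json.loads(match.group(1))
```

In most non-browser clients it is simpler to omit the callback parameter and parse the plain JSON response directly.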
Parameters and behavior
- source (required): A free-text string identifying the caller (or send a Referer header). The app requires either a source parameter or a referer to help track usage and contact callers if needed.
- cache: cache=fallback (default) | cache=always | cache=never
  - fallback (default): try cache first; if not available, fetch from upstream and cache; in case of upstream failure it may return recently expired cache (subject to the cache policy).
  - always: always return cached data if available; do not fetch upstream (used to ensure fast, deterministic responses)
  - never: never use cache (forces a live fetch), and increases the HTTP timeout to allow longer fetch time
- timeout: number of seconds for the request to complete; this sets a soft finish_by_time_in_epoch_seconds that lowers UA timeouts dynamically
- callback: JSONP support; when provided, the response Content-Type becomes text/javascript and the JSON is wrapped as callback(JSON)
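The timeout parameter is described as a soft deadline that shrinks the timeouts of individual upstream requests. The general idea can be sketched as below. This is not the service's Perl code: only the name finish_by_time_in_epoch_seconds comes from the description above, while `remaining_timeout`, `default_timeout`, and `minimum` are assumptions.

```python
import time


def remaining_timeout(finish_by_time_in_epoch_seconds, default_timeout,
                      now=None, minimum=1.0):
    """Return the timeout to use for the next upstream request.

    The result never exceeds default_timeout and shrinks as the deadline
    approaches. minimum (an assumption of this sketch) keeps it positive
    once the deadline has passed.
    """
    if now is None:
        now = time.time()
    seconds_left = finish_by_time_in_epoch_seconds - now
    return max(minimum, min(default_timeout, seconds_left))
```

Recomputing the value before each upstream call is what makes the deadline apply to the request as a whole rather than to each fetch separately.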
- The module uses ProfilesEasyJSON::CHI for caching. There are several separate namespaces used by the code (identifier -> canonical URL mapping, canonical URL -> JSON, raw URL fetch cache for ORNG gadget fields, and some position/name caches).
- When upstream fetches fail, the code will attempt to fall back to expired cache entries if they exist and pass the cache policy check. The cache policy permits returning cached entries no older than 14 days (see _verify_cache_object_policy).
- cache=always will prevent upstream fetches and may return stale data; cache=never forces a live fetch but adjusts timeouts to allow slower upstream responses.
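The three cache modes and the 14-day fallback rule can be illustrated with a small in-memory sketch. It is not the module's implementation. The dictionary layout, the `ttl` value, and the choices to skip cache writes in never mode and to skip the age check in always mode are all assumptions.

```python
import time

MAX_STALE_SECONDS = 14 * 24 * 60 * 60  # 14-day limit from the cache policy


def get_profile(key, cache, fetch_upstream, mode="fallback", ttl=3600, now=None):
    """Look up key using one of the modes: fallback, always, never.

    cache is a plain dict of key -> {"value", "stored_at", "expires_at"}
    (hypothetical layout). fetch_upstream is any callable that returns the
    value or raises on failure.
    """
    now = time.time() if now is None else now
    entry = cache.get(key)

    # Fresh hit: used by every mode except "never".
    if mode != "never" and entry is not None and entry["expires_at"] > now:
        return entry["value"]

    # "always" never goes upstream; it returns whatever is cached, even if stale.
    if mode == "always":
        return entry["value"] if entry is not None else None

    try:
        value = fetch_upstream(key)
    except Exception:
        # Upstream failed: reuse an expired entry if it is recent enough.
        recent = entry is not None and now - entry["stored_at"] <= MAX_STALE_SECONDS
        if mode == "fallback" and recent:
            return entry["value"]
        raise

    if mode != "never":  # assumption: "never" also skips cache writes
        cache[key] = {"value": value, "stored_at": now, "expires_at": now + ttl}
    return value
```

The point of the fallback branch is that a brief upstream outage degrades to slightly old data rather than an error, while the age limit bounds how old that data can be.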
- profilesdotjson.conf (optional): used to store secrets/RC4 password for decrypting emailEncrypted fields in Profiles RDF. The code expects an RC4 password at key RC4_PASSWORD (first value used). Example format (simple key=value file): RC4_PASSWORD=supersecret
- The code will attempt to decrypt emailEncrypted fields using Crypt::RC4 and Base64 decode if the config file contains the RC4 password. If not present, the code tries a vCard endpoint to retrieve the public email.
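To make the decryption step concrete, here is a sketch of reading the password from key=value text and undoing Base64 plus RC4. The exact wire format is an assumption: the sketch treats the field as Base64 text wrapping RC4 ciphertext with a UTF-8 plaintext, and the function names are hypothetical. RC4 itself is implemented inline so the example has no third-party dependencies.

```python
import base64


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; the same function encrypts and decrypts."""
    state = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    out = bytearray()
    i = j = 0
    for byte in data:  # keystream generation, XORed with the input
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def read_rc4_password(config_text):
    """Return the first RC4_PASSWORD value found in key=value text, or None."""
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "RC4_PASSWORD":
            return value.strip()
    return None


def decrypt_email(email_encrypted, password):
    """Assumed format: Base64 text wrapping RC4 ciphertext, UTF-8 plaintext."""
    ciphertext = base64.b64decode(email_encrypted)
    return rc4(password.encode("utf-8"), ciphertext).decode("utf-8")
```

RC4 is considered cryptographically weak; it appears here only because the upstream data is described as using it.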
- Two upstream endpoints are used:
  - CustomAPI/v2/Default.aspx: used to map identifiers (FNO, EmployeeID, Person, PrettyURL, etc.) to an internal rdf:about node URI for the person.
  - /ORNG/JSONLD/Default.aspx: used to fetch expanded JSON-LD for a given subject (node id) with expand=true and showdetails=true. This JSON-LD contains the @graph of items used to build the simplified JSON.
- The code parses the JSON-LD @graph into internal structures (items_by_url_id, publications_by_author, research_activities_and_funding_by_role, orng_data, etc.) and normalizes inconsistently shaped fields (singletons vs arrays, double-encoded JSON strings, chunked ORNG gadget data).
- A final hashref is composed (Profiles => [ ... ]) with normalized fields such as Name, Email, ProfilesURL, Titles, Address, Publications, Education_Training, MediaLinks, GlobalHealth, ClinicalTrials, ResearchActivitiesAndFunding, and more. The result is encoded to JSON and returned as the HTTP response body.
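Two of the normalization problems mentioned above, singleton-versus-array fields and double-encoded JSON strings, can be illustrated with the following sketch. It shows the general technique only; the helper names are hypothetical and the module's actual Perl logic may differ.

```python
import json


def as_list(value):
    """Normalize a field that may be missing, a single item, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def decode_nested_json(value, max_depth=3):
    """Decode a string holding JSON, including JSON encoded more than once.

    Decoding stops when the value is no longer a string, when it is not
    valid JSON, or after max_depth rounds (a guard chosen for this sketch).
    """
    for _ in range(max_depth):
        if not isinstance(value, str):
            break
        try:
            value = json.loads(value)
        except ValueError:
            break  # plain text, leave it untouched
    return value
```

One caveat of this approach is that a plain string which happens to be valid JSON, such as "123", is decoded as well, so it suits only fields that are known to carry JSON.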
- The repository includes tests that exercise many real Profiles records. Be aware these are integration-like tests and will call upstream Profiles servers; tests can be skipped when upstream data is missing.
- To run the tests locally:
  - Install dependencies (see cpanfile/Cpanfile or install required modules). Dependencies include LWP::UserAgent, JSON, Data::Visitor::Callback, Crypt::RC4 (optional), CHI, Test::More, etc.
  - Run tests with: prove -lv t/library-mega.t or perl -Ilib t/library-mega.t
  - Individual tests may be skipped depending on upstream availability; the test suite expects many assertions but several are guarded with SKIP blocks when live data cannot be fetched.
- Start the PSGI app with plackup:
  plackup -p 5000 app.psgi
  Then query:
  curl "http://localhost:5000/?ProfilesURLName=anirvan.chatterjee&source=localtest"
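The same query can be issued from a script. The sketch below assumes the response has the Profiles list described earlier. The function name and the `opener` parameter (which allows the HTTP call to be replaced in tests) are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_first_profile(url, timeout=10, opener=urlopen):
    """GET the service URL and return the first entry of Profiles, or None."""
    with opener(url, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    profiles = payload.get("Profiles") or []
    return profiles[0] if profiles else None


# Example call against a local plackup instance (not executed here):
# profile = fetch_first_profile(
#     "http://localhost:5000/?ProfilesURLName=anirvan.chatterjee&source=localtest"
# )
# print(profile["Name"] if profile else "no profile returned")
```

Returning None for an empty list lets the caller tell "no such profile" apart from a network error, which surfaces as an exception from the opener.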
- If you see many upstream connection timeouts, consider increasing the timeout parameter in the client request or running with cache=always to avoid upstream fetches.
- If email fields are missing, verify profilesdotjson.conf contains the correct RC4 password if your Profiles instance uses encrypted publicly-visible emails. Otherwise, the module tries the vCard endpoint.
- Inconsistent or missing gadget data (ORNG) may be due to different ORNG gadget implementations; the code attempts to decode many shapes (string, array, chunked pieces) but if you see odd results, capture upstream JSON-LD and open an issue.