Skip to content

Refactor UrlId into FilePath enum#1251

Open
lionel- wants to merge 11 commits into
oak/salsa-14-canonicalisationfrom
oak/salsa-15-file-path-refactor
Open

Refactor UrlId into FilePath enum#1251
lionel- wants to merge 11 commits into
oak/salsa-14-canonicalisationfrom
oak/salsa-15-file-path-refactor

Conversation

@lionel-

@lionel- lionel- commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Branched from #1250 (see that PR first for context)
Progress towards #1212

  • Rename UrlId to FilePath
  • Make it our own custom enum instead of piggybacking on the Url enum. Variants: File (lexically normalised) and Virtual (wrap an Uri e.g. for untitled files). We could add Vendored later for vendored base R sources, mirroring ty.

This refactor better reflects that the main representation for our files are on-disk paths.

@lionel- lionel- force-pushed the oak/salsa-14-canonicalisation branch from 0e6a188 to eb2c844 Compare June 4, 2026 14:19
@lionel- lionel- force-pushed the oak/salsa-15-file-path-refactor branch from 4ac9528 to e807fd2 Compare June 4, 2026 14:19

@DavisVaughan DavisVaughan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea of FilePath, but we are currently in a state where A LOT of existing code is still using url, url_id, or something similar, and I think it results in a net negative in terms of code readability. I have marked a few cases, but have definitely not captured them all. I'd like to advocate for a close review of all uses of FilePath before merging to ensure that the corresponding binding is also called path or file_path.

Comment thread crates/aether_path/src/file_path.rs Outdated

/// Build a [`FilePath::File`] from a filesystem path. Errors if
/// the path can't be expressed as a UTF-8 absolute path.
pub fn from_file_path(path: impl AsRef<Path>) -> anyhow::Result<Self> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should just be from_path() now, because I saw FilePath::from_file_path() elsewhere and was pretty confused by that naming

Or actually, probably from_path_buf() to mimic the AbsPathBuf change

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pondered these alternatives and went with from_file_path() for consistency with UrlId. Happy to go back on that though.

Comment thread crates/aether_path/src/file_path.rs Outdated
Comment thread crates/aether_path/src/file_path.rs Outdated
Comment thread crates/ark/src/dap/dap_state.rs Outdated
Comment on lines 302 to 308
fn canonical_path(url: &FilePath) -> Option<PathBuf> {
if !url.is_file() {
return None;
}
let path = url.to_file_path().ok()?;
let path = url.to_path_buf()?;
std::fs::canonicalize(path).ok()
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn canonical_path(url: &FilePath) -> Option<PathBuf> {
if !url.is_file() {
return None;
}
let path = url.to_file_path().ok()?;
let path = url.to_path_buf()?;
std::fs::canonicalize(path).ok()
}
fn canonical_path(url: &FilePath) -> Option<PathBuf> {
match url {
FilePath::File(path) => std::fs::canonicalize(path.as_path()).ok(),
FilePath::Virtual(_) => None,
}
}

Hmm. I'd like to challenge the addition of the to_path_buf() method.

I understand the desire to have a way to go from FilePath -> PathBuf for compatibility, but I have a strong feeling that it would be better to stay in UTF8PathBuf form as long as possible.

Like, I think this rewrite of canonical_path() is more obvious and takes advantage of UTF8PathBuf features

Can we avoid adding to_path_buf() entirely in favor of rewrites likes this one? And should canonical_path() itself also return another UTF8PathBuf?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I've removed it

Comment on lines +265 to +266
#[cfg(test)]
mod tests;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this in lib.rs?

Comment thread crates/oak_scan/src/watch.rs Outdated
Comment thread crates/oak_scan/src/watch.rs Outdated
Comment on lines 59 to 62
let Some(path) = url.to_path_buf() else {
log::warn!("Skipping add_watched_file: URL is not a file path");
return;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let Some(path) = url.to_path_buf() else {
log::warn!("Skipping add_watched_file: URL is not a file path");
return;
};
let Some(path) = url.as_file() else {
log::warn!("Skipping add_watched_file: URL is not a file path");
return;
};
let Some(placement) = classify(db, path.as_path().as_std_path()) else {
// Either the URL falls outside every workspace, or it lives
// inside a package subdir we don't track (tests/, inst/, ...).
return;
};

i would have written as this to avoid the expensive to_path_buf(). no need to make a clone here.

more evidence to me that to_path_buf() is a possible anti pattern

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and really we'd probably make classify() take &Utf8PathBuf

Comment thread crates/oak_db/src/inputs.rs Outdated
Comment thread crates/oak_db/src/db.rs Outdated
Comment thread crates/ark/src/url.rs Outdated
@lionel- lionel- force-pushed the oak/salsa-14-canonicalisation branch from eb2c844 to 85cdf4f Compare June 10, 2026 14:31
@lionel- lionel- force-pushed the oak/salsa-15-file-path-refactor branch from e807fd2 to 10deebe Compare June 10, 2026 14:32
@lionel- lionel- force-pushed the oak/salsa-14-canonicalisation branch from 85cdf4f to 9463329 Compare June 11, 2026 06:26
@lionel- lionel- force-pushed the oak/salsa-15-file-path-refactor branch from 10deebe to 276eba8 Compare June 11, 2026 07:42
@lionel- lionel- force-pushed the oak/salsa-15-file-path-refactor branch from 26ed4dd to 11deb77 Compare June 11, 2026 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants