+++
title="The `look` Unix command"
date=2021-08-28
draft=true

[taxonomies]
tags=["dev", "rust"]
+++

Ever heard of the `look` Unix command?
Me neither. And yet it is installed on my computer, and probably on yours too.

Try it!

```
> look rust
rust
rustable
rustful
rustic
rustical
rustically
rusticalness
rusticate
rustication
rusticator
rusticial
rusticism
...
```

It prints every line of a file that starts with the given string.
If you don't pass a file, it searches `/usr/share/dict/words`.
And it's been around for a while, like since "Version 7 AT&T UNIX", according to `man look`.
That was 1979, and I only found out about it in 2021. Oh well.

Anyhow, let's write a Rust thing, shall we?

```rust
use std::{
    error::Error,
    fs::File,
    io::{self, BufRead},
    path::Path,
};

const USAGE: &str = "usage: look [-df] [-t char] string [file ...]";

// https://doc.rust-lang.org/rust-by-example/std_misc/file/read_lines.html
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where
    P: AsRef<Path>,
{
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

fn main() -> Result<(), Box<dyn Error>> {
    // The search string is the first argument; print the usage to stderr and bail out if it's missing.
    let prefix = std::env::args().nth(1).unwrap_or_else(|| {
        eprintln!("{}", USAGE);
        std::process::exit(2);
    });
    // Lowercase the prefix once so the comparison below ignores case on both sides.
    let prefix = prefix.to_lowercase();

    for line in read_lines("/usr/share/dict/words")? {
        // Propagate read errors instead of silently skipping the line.
        let line = line?;
        if line.to_lowercase().starts_with(&prefix) {
            println!("{}", line);
        }
    }
    Ok(())
}
```

On my MacBook, that produces the same number of lines as the system version.

```
> cargo run -- rust | wc -l
33
> look rust | wc -l
33
```

Is it equally fast?

```
brew install hyperfine
```

```
hyperfine --warmup 5 'look rust' 'target/release/lookrs rust'
Benchmark #1: look rust
  Time (mean ± σ):       4.4 ms ±   6.9 ms    [User: 0.7 ms, System: 1.2 ms]
  Range (min … max):     0.0 ms …  25.4 ms    123 runs

Benchmark #2: target/release/lookrs rust
  Time (mean ± σ):      82.6 ms ±  12.9 ms    [User: 73.5 ms, System: 2.7 ms]
  Range (min … max):    63.0 ms … 111.7 ms    33 runs

Summary
  'look rust' ran
   18.92 ± 30.14 times faster than 'target/release/lookrs rust'
```

Nope, my naive version is about 20x slower.

I just remembered that [ripgrep](https://github.com/BurntSushi/ripgrep) could also be used for that.
It supports `\b` to denote word boundaries:

```
rg '\brust' /usr/share/dict/words
```
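
Checking with hyperfine, it's about 3x slower than `look`:

```
hyperfine --warmup 5 'rg '\brust' /usr/share/dict/words'
Benchmark #1: rg brust /usr/share/dict/words
  Time (mean ± σ):      12.3 ms ±   7.4 ms    [User: 3.9 ms, System: 2.9 ms]
  Range (min … max):     4.4 ms …  26.6 ms    198 runs
```

(Careful with the quoting here: the nested single quotes end the string early, so the shell eats the backslash and hyperfine really runs `rg brust`, as the benchmark header shows. Wrapping the whole command in double quotes, `"rg '\brust' /usr/share/dict/words"`, keeps the pattern intact.)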

The `look` manpage gives a hint:

> **As look performs a binary search, the lines in file must be sorted.**

And indeed if we randomize the input, `look` becomes completely useless:

```
cat /usr/share/dict/words | sort -R > random.txt
look rust random.txt
*No output*
```

(Depending on how the shuffle turns out, you might still get a few results.)

But back in the '70s, that tradeoff made sense: storage space was limited and CPUs were slow,
so requiring a sorted file allowed for much faster lookups. A binary search only has to touch a handful of lines instead of scanning the whole file.
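
To make the binary search idea concrete, here is a minimal sketch of a prefix lookup over a sorted word list. It is not taken from the real `look` source: `look_sorted` is a made-up helper, and it assumes the words are already in memory and sorted in plain byte order, with no case folding.

```rust
/// Returns every entry of `words` that starts with `prefix`.
/// Hypothetical helper; assumes `words` is sorted in plain byte order.
fn look_sorted<'a>(words: &[&'a str], prefix: &str) -> Vec<&'a str> {
    // Binary search for the first entry that is not smaller than the prefix.
    let start = words.partition_point(|w| *w < prefix);
    // In a sorted list all matches sit next to each other,
    // so we can stop at the first entry that no longer matches.
    words[start..]
        .iter()
        .copied()
        .take_while(|w| w.starts_with(prefix))
        .collect()
}

fn main() {
    let sorted = ["rubble", "ruby", "rust", "rustic", "rusty", "rut"];
    // prints ["rust", "rustic", "rusty"]
    println!("{:?}", look_sorted(&sorted, "rust"));

    // Same words, shuffled: the binary search may land in the wrong spot,
    // so the result is unspecified and matches can go missing.
    let shuffled = ["rusty", "rut", "rubble", "rustic", "ruby", "rust"];
    println!("{:?}", look_sorted(&shuffled, "rust"));
}
```

The sorted case only needs a few comparisons to find the starting point, no matter how long the list is. The shuffled case shows why the manpage insists on sorted input: nothing crashes, the answer is just not trustworthy anymore.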

Let's peek into [the `look` Darwin source code](https://opensource.apple.com/source/text_cmds/text_cmds-106/look/look.c.auto.html) to see how they do it, shall we?

```c
```