
/tensorflow/tensorboard/http_api.md

https://gitlab.com/hrishikeshvganu/tensorflow
# TensorBoard client-server HTTP API

## Runs, Tags, and Tag Types

TensorBoard data is organized around the concept of a `run`, which represents
all the related data thrown off by a single execution of TensorFlow; a `tag`,
which groups values of data that come from the same source within a TensorFlow
run; and `tag types`, which are our way of distinguishing different types of
data that have fundamentally different representations and should be processed
on different code paths. For example, a "train" run may have a `scalars`
tag that represents the learning rate, another `scalars` tag that
represents the value of the objective function, a `histograms` tag that reveals
information on weights in a particular layer over time, and an `images` tag that
shows input images flowing into the system. The "eval" run might have an
entirely different set of tag names, or some duplicated tag names.

The currently supported tag types are `scalars`, `images`, `audio`,
`histograms`, `compressedHistograms`, `graph`, and `run_metadata`. Each tag
type corresponds to a route (documented below) for retrieving tag data of that
type.

All of the data provided comes from TensorFlow events files (`*.tfevents*`),
which are written using the SummaryWriter class
(tensorflow/python/training/summary_writer.py), and the data is generated by
summary ops (tensorflow/python/ops/summary_ops.py). The `scalars` come from the
`ScalarSummary` op, the `histograms` from the `HistogramSummary`, the `audio`
from the `AudioSummary`, and the `images` from `ImageSummary`. The tag type
`graph` is special in that it is not a collection of tags of that type, but a
boolean denoting whether there is a graph definition associated with the run.
The tag is provided to the summary op (usually as a constant).
## `data/runs`

Returns a dictionary mapping from `run name` (quoted string) to dictionaries
mapping from all available tagTypes to a list of tags of that type available for
the run. Think of this as a comprehensive index of all of the data available
from the TensorBoard server. Here is an example:

```json
{
  "train_run": {
    "histograms": ["foo_histogram", "bar_histogram"],
    "compressedHistograms": ["foo_histogram", "bar_histogram"],
    "scalars": ["xent", "loss", "learning_rate"],
    "images": ["input"],
    "audio": ["input_audio"],
    "graph": true,
    "run_metadata": ["forward prop", "inference"]
  },
  "eval": {
    "histograms": ["foo_histogram", "bar_histogram"],
    "compressedHistograms": ["foo_histogram", "bar_histogram"],
    "scalars": ["precision", "recall"],
    "images": ["input"],
    "audio": ["input_audio"],
    "graph": false,
    "run_metadata": []
  }
}
```

Note that the same tag may be present for many runs. It is not guaranteed that
they will have the same meaning across runs. It is also not guaranteed that they
will have the same tag type across different runs.
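As a sketch of how a client might walk this index (the response body here is
canned rather than fetched over HTTP; in practice it would come from a GET
against the server, e.g. an assumed address like `http://localhost:6006/data/runs`):

```python
import json

# A canned /data/runs response body (normally obtained via HTTP GET).
response_body = """
{
  "train_run": {
    "scalars": ["xent", "loss"],
    "graph": true
  }
}
"""

runs = json.loads(response_body)

# Enumerate every (run, tag type, tag) triple the server knows about.
for run_name, tag_types in runs.items():
    for tag_type, tags in tag_types.items():
        if tag_type == "graph":
            continue  # `graph` is a boolean, not a list of tags
        for tag in tags:
            print(run_name, tag_type, tag)
```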
## `/data/scalars?run=foo&tag=bar`

Returns an array of event_accumulator.SimpleValueEvents ([wall_time, step,
value]) for the given run and tag. wall_time is seconds since epoch.

Example:

```
[
  [1443856985.705543, 1448, 0.7461960315704346],  # wall_time, step, value
  [1443857105.704628, 3438, 0.5427092909812927],
  [1443857225.705133, 5417, 0.5457325577735901],
  ...
]
```

If the `format` parameter is set to 'csv', the response will instead be in CSV
format:

```
Wall time,step,value
1443856985.705543,1448,0.7461960315704346
1443857105.704628,3438,0.5427092909812927
1443857225.705133,5417,0.5457325577735901
```
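The CSV variant can be parsed with the standard library alone; a sketch (the
response body is canned here rather than fetched):

```python
import csv
import io

# A canned response from /data/scalars?run=foo&tag=bar&format=csv.
csv_body = """Wall time,step,value
1443856985.705543,1448,0.7461960315704346
1443857105.704628,3438,0.5427092909812927
"""

# Convert each row back into a (wall_time, step, value) tuple.
reader = csv.DictReader(io.StringIO(csv_body))
events = [(float(r["Wall time"]), int(r["step"]), float(r["value"]))
          for r in reader]
```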
## `/data/scalars?[sample_count=10]`

Without any parameters, returns a dictionary mapping from run name to a
dictionary mapping from tag name to a sampled list of scalars from that run and
tag. The values are given in the same format as when the run and tag are
specified. For example:

```json
{
  "train_run": {
    "my_tag": [
      [1443856985.705543, 1448, 0.7461960315704346],
      [1443857105.704628, 3438, 0.5427092909812927],
      [1443857225.705133, 5417, 0.5457325577735901]
    ]
  }
}
```

The samples are distributed uniformly over the list of values. The sample_count
parameter is optional and defaults to 10; it must be at least 2. The first and
the last value will always be sampled.
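One way to pick indices satisfying these constraints (uniform spacing, first
and last always included) can be sketched as follows. This illustrates the
contract only, not the server's actual implementation:

```python
def sample_indices(n, sample_count=10):
    """Return up to `sample_count` indices spread uniformly over range(n),
    always including the first (0) and last (n - 1) index."""
    if n <= sample_count:
        return list(range(n))
    # Spread sample_count points evenly over [0, n - 1]; the endpoints of
    # the linear spacing are exactly 0 and n - 1.
    return [round(i * (n - 1) / (sample_count - 1))
            for i in range(sample_count)]
```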
## `/data/histograms?run=foo&tag=bar`

Returns an array of event_accumulator.HistogramEvents ([wall_time, step,
HistogramValue]) for the given run and tag. A HistogramValue is [min, max, num,
sum, sum_squares, bucket_limit, bucket]. wall_time is seconds since epoch.

Annotated example (note: real data is higher precision):

```
[
  [
    1443871386.185149,  # wall_time
    235166,             # step
    [
      -0.66,  # minimum value
      0.44,   # maximum value
      8.0,    # number of items in the histogram
      -0.80,  # sum of items in the histogram
      0.73,   # sum of squares of items in the histogram
      [-0.68, -0.62, -0.292, -0.26, -0.11, -0.10, -0.08, -0.07, -0.05,
       -0.0525, -0.0434, -0.039, -0.029, -0.026, 0.42, 0.47, 1.8e+308],
          # the right edge of each bucket
      [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
       0.0, 1.0, 0.0]  # the number of elements within each bucket
    ]
  ]
]
```
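The aggregate fields make simple statistics cheap to recover client-side
without touching the buckets; a sketch using the annotated values above:

```python
import math

# HistogramValue summary fields from the example above:
# [min, max, num, sum, sum_squares, ...] (bucket arrays not needed here).
num, total, sum_squares = 8.0, -0.80, 0.73

mean = total / num                        # average of the recorded items
variance = sum_squares / num - mean ** 2  # E[x^2] - E[x]^2
stddev = math.sqrt(max(variance, 0.0))    # clamp guards against rounding error
```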
## `/data/compressedHistograms?run=foo&tag=bar`

Returns an array of event_accumulator.CompressedHistogramEvents ([wall_time,
step, CompressedHistogramValues]) for the given run and tag.

CompressedHistogramValues is a list of namedtuples, each specifying a basis
point (bps) and the interpolated value of the histogram at that basis point. A
basis point is 1/100 of a percent.

The current compression strategy is to choose basis points that correspond to
the median and bands of 1, 2, and 3 standard deviations around the median. Note
that the current compression strategy does not work well for representing
multimodal data -- this is something that will be improved in a later
iteration.

Annotated example (note: real data is higher precision):

```
[
  [
    1441154832.580509,  # wall_time
    5,                  # step
    [
      [0, -3.67],       # CompressedHistogramValue for the 0th percentile
      [2500, -4.19],    # CompressedHistogramValue for the 25th percentile
      [5000, 6.29],
      [7500, 1.64],
      [10000, 3.67]
    ]
  ],
  ...
]
```
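Given one step's CompressedHistogramValues, a client can linearly interpolate
the histogram value at any intermediate basis point; a sketch (the helper name
is illustrative, not part of the API):

```python
def value_at_bps(compressed, bps):
    """Linearly interpolate a compressed histogram at a basis point.

    `compressed` is a list of [bps, value] pairs sorted by bps, as returned
    for a single step by /data/compressedHistograms."""
    for (b0, v0), (b1, v1) in zip(compressed, compressed[1:]):
        if b0 <= bps <= b1:
            frac = (bps - b0) / (b1 - b0)
            return v0 + frac * (v1 - v0)
    raise ValueError("bps outside the compressed range")

# The example step's values from above.
points = [[0, -3.67], [2500, -4.19], [5000, 6.29], [7500, 1.64], [10000, 3.67]]
```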
## `/data/images?run=foo&tag=bar`

Gets a sample of ImageMetadatas for the given run and tag.

Returns an array of objects containing information about available images,
crucially including the query parameter that may be used to retrieve that image.
(See /individualImage for details.)

For example:

```
{
  "width": 28,                  # width in pixels
  "height": 28,                 # height in pixels
  "wall_time": 1440210599.246,  # time in seconds since epoch
  "step": 63702821,             # number of steps that have passed
  "query": "index=0&tagname=input%2Fimage%2F2&run=train"
                                # param for /individualImage
}
```
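A client uses the `query` field verbatim and only prepends the route; a sketch
(the host and port are assumptions for illustration):

```python
# An image metadata object as returned by /data/images.
metadata = {
    "width": 28,
    "height": 28,
    "wall_time": 1440210599.246,
    "step": 63702821,
    "query": "index=0&tagname=input%2Fimage%2F2&run=train",
}

base = "http://localhost:6006"  # assumed TensorBoard address
# The query string is already URL-encoded by the server; do not re-encode it.
url = base + "/individualImage?" + metadata["query"]
```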
## `/data/individualImage?{{query}}`

Retrieves an individual image. The image query should not be generated by the
frontend, but instead acquired from calling the /images route (the image
metadata objects contain the query to use). The response is the image itself
with mime-type 'image/png'.

Note that the query is not guaranteed to always refer to the same image even
within a single run, as images may be removed from the sampling reservoir and
replaced with other images. (See Notes for details on the reservoir sampling.)

An example call to this route would look like this:

```
/individualImage?index=0&tagname=input%2Fimage%2F2&run=train
```
## `/audio?run=foo&tag=bar`

Gets a sample of AudioMetadatas for the given run and tag.

Returns an array of objects containing information about available audio,
crucially including the query parameter that may be used to retrieve that audio.
(See /individualAudio for details.)

For example:

```
{
  "wall_time": 1440210599.246,  # time in seconds since epoch
  "step": 63702821,             # number of steps that have passed
  "content_type": "audio/wav",  # the MIME type of the audio
  "query": "index=0&tagname=input%2Faudio%2F2&run=train"
                                # param for /individualAudio
}
```

## `/individualAudio?{{query}}`

Retrieves an individual audio clip. The audio query should not be generated by
the frontend, but instead acquired from calling the /audio route (the audio
metadata objects contain the query to use). The response is the audio itself
with an appropriate Content-Type header set.

Note that the query is not guaranteed to always refer to the same clip even
within a single run, as audio may be removed from the sampling reservoir and
replaced with other clips. (See Notes for details on the reservoir sampling.)

An example call to this route would look like this:

```
/individualAudio?index=0&tagname=input%2Faudio%2F2&run=train
```
## `/data/graph?run=foo&limit_attr_size=1024&large_attrs_key=key`

Returns the graph definition for the given run in gzipped pbtxt format. The
graph is composed of a list of nodes, where each node is a specific TensorFlow
operation which takes as inputs other nodes (operations).

The query parameters `limit_attr_size` and `large_attrs_key` are optional.

`limit_attr_size` specifies the maximum allowed size in bytes before an
attribute is considered large and filtered out of the graph. If specified, it
must be an int and > 0. If not specified, no filtering is applied.

`large_attrs_key` is the attribute key that will be used for storing attributes
that are too large. The value of this key (a list of strings) should be used by
the client in order to determine which attributes have been filtered. It must
be specified if `limit_attr_size` is specified.

For the query `/graph?run=foo&limit_attr_size=1024&large_attrs_key=_too_large`,
here is an example pbtxt response of a graph with 3 nodes, where the second
node had two large attributes "a" and "b" that were filtered out (size > 1024):

```
node {
  op: "Input"
  name: "A"
}
node {
  op: "Input"
  name: "B"
  attr {
    key: "small_attr"
    value: {
      s: "some string"
    }
  }
  attr {
    key: "_too_large"
    value {
      list {
        s: "a"
        s: "b"
      }
    }
  }
}
node {
  op: "MatMul"
  name: "C"
  input: "A"
  input: "B"
}
```

Prior to filtering, the original node "B" had the following content:

```
node {
  op: "Input"
  name: "B"
  attr {
    key: "small_attr"
    value: {
      s: "some string"
    }
  }
  attr {
    key: "a"
    value { Very large object... }
  }
  attr {
    key: "b"
    value { Very large object... }
  }
}
```
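Once a node's attributes are parsed into a mapping (pbtxt parsing itself is out
of scope for this sketch), the list stored under the `large_attrs_key` tells
the client which attributes were filtered. Using the `_too_large` key from the
example query above:

```python
# Node "B"'s attributes from the filtered response, already parsed into a
# dict for illustration. "_too_large" is the large_attrs_key from the query;
# any other key name works the same way.
node_attrs = {
    "small_attr": "some string",
    "_too_large": ["a", "b"],  # names of attributes removed by the server
}

# Attributes the client knows were filtered out (empty list if none were).
filtered = node_attrs.get("_too_large", [])
```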
## `/data/run_metadata?run=foo&tag=bar`

Given a run and tag, returns the metadata of a particular `session.run()` as a
gzipped, pbtxt-serialized [`RunMetadata`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto)
proto. For example:

```
step_stats {
  dev_stats {
    device: "/job:localhost/replica:0/task:0/cpu:0"
    node_stats {
      node_name: "_SOURCE"
      all_start_micros: 1458337695775395
      op_start_rel_micros: 11
      op_end_rel_micros: 12
      all_end_rel_micros: 38
      memory {
        allocator_name: "cpu"
      }
      timeline_label: "_SOURCE = NoOp()"
      scheduled_micros: 1458337695775363
    }
  }
}
```
## Notes

All returned values, histograms, audio, and images are returned in the order
they were written by TensorFlow (which should correspond to increasing
`wall_time` order, but may not necessarily correspond to increasing step count
if the process had to restart from a previous checkpoint).

The returned values may be downsampled using reservoir sampling, which is
configurable by the TensorBoard server. When downsampling occurs, the server
guarantees that different tags will all sample at the same sequence of indices,
so that if there are two tags `A` and `B` which are related so that `A[i] ~
B[i]` for all `i`, then `D(A)[i] ~ D(B)[i]` for all `i`, where `D` represents
the downsampling operation.

The reservoir sampling puts an upper bound on the number of items that will be
returned for a given run-tag combination, and guarantees that all items are
equally likely to be in the final sample (i.e. it is a uniform distribution
over the values), with the proviso that the most recent individual item is
always included in the sample.

The reservoir sizes are configurable on a per-tag-type basis.
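The sampling behavior described above can be sketched as follows. This is an
illustrative simplification, not TensorBoard's actual implementation: a
standard reservoir sample, with the most recent item forced into the result
afterwards:

```python
import random

def downsample(items, k, seed=0):
    """Reservoir-sample at most k items from `items` (uniformly at random),
    then force-include the most recent item, mirroring the guarantee above."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1), which keeps
            # every item seen so far equally likely to be in the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    if items and items[-1] not in reservoir:
        reservoir[-1] = items[-1]  # the most recent item is always kept
    return reservoir
```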