.\" Copyright (c) 2002 Luigi Rizzo
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd April 6, 2007
.Dt POLLING 4
.Os
.Sh NAME
.Nm polling
.Nd device polling support
.Sh SYNOPSIS
.Cd "options DEVICE_POLLING"
.Sh DESCRIPTION
Device polling
.Nm (
for brevity) refers to a technique that
lets the operating system periodically poll devices, instead of
relying on the devices to generate interrupts when they need attention.
This might seem inefficient and counterintuitive, but when done
properly,
.Nm
gives the operating system more control over
when and how to handle devices, with a number of advantages in terms
of system responsiveness and performance.
.Pp
In particular,
.Nm
reduces the context switch overhead
incurred when servicing interrupts, and
gives more control over the scheduling of the CPU between various
tasks (user processes, software interrupts, device handling),
which ultimately reduces the chances of livelock in the system.
.Ss Principles of Operation
In the normal, interrupt-based mode, devices generate an interrupt
whenever they need attention.
This in turn causes a
context switch and the execution of an interrupt handler
which performs whatever processing is needed by the device.
The duration of the interrupt handler is potentially unbounded
unless the device driver has been programmed with real-time
concerns in mind (which is generally not the case for
.Fx
drivers).
Furthermore, under heavy traffic load, the system might be
persistently processing interrupts without being able to
complete other work, either in the kernel or in userland.
.Pp
Device polling disables device interrupts and instead polls devices at
appropriate times, i.e., on clock interrupts and within the idle loop.
This way, the context switch overhead is removed.
Furthermore,
the operating system can accurately control how much work to spend
in handling device events, and thus prevent livelock by reserving
some amount of CPU for other tasks.
.Pp
Enabling
.Nm
also changes the way software network interrupts
are scheduled, which removes the risk of livelock caused by
packets not being processed to completion.
.Ss Enabling polling
Currently only network interface drivers support the
.Nm
feature.
It is turned on and off with the
.Xr ifconfig 8
command.
.Pp
The historic
.Va kern.polling.enable ,
which enabled polling for all interfaces, can be replaced with the following
code:
.Bd -literal
for i in `ifconfig -l` ;
do ifconfig $i polling; # use -polling to disable
done
.Ed
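.Pp
The loop above applies the setting to every interface, including ones
whose drivers do not support polling.
A minimal sketch of a variant that takes an explicit interface list and
reports failures might look as follows.
The function name set_polling and the IFCONFIG override variable are
illustrative assumptions, not part of any existing tool; the override only
exists so the loop can be dry-run.

```shell
#!/bin/sh
# set_polling MODE IFACE...
# MODE is "polling" to enable or "-polling" to disable.
# IFCONFIG is a hypothetical override (for example IFCONFIG=echo for a
# dry run); it defaults to the real ifconfig(8).
set_polling() {
    mode=$1
    shift
    for ifc in "$@"; do
        # Report interfaces that reject the setting and keep going.
        "${IFCONFIG:-ifconfig}" "$ifc" "$mode" ||
            echo "warning: could not set $mode on $ifc" >&2
    done
}
```

For example, "set_polling polling em0 em1" would enable polling on two
interfaces, where em0 and em1 are placeholder names.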
.Ss MIB Variables
The operation of
.Nm
is controlled by the following
.Xr sysctl 8
MIB variables:
.Pp
.Bl -tag -width indent -compact
.It Va kern.polling.user_frac
When
.Nm
is enabled, and provided that there is some work to do,
up to this percentage of the CPU cycles is reserved for userland tasks,
the remaining fraction being available for
.Nm
processing.
Default is 50.
.Pp
.It Va kern.polling.burst
Maximum number of packets grabbed from each network interface in
each timer tick.
This number is dynamically adjusted by the kernel,
according to the programmed
.Va user_frac , burst_max ,
CPU speed, and system load.
.Pp
.It Va kern.polling.each_burst
The burst above is split into smaller chunks of this number of
packets, going round-robin among all interfaces registered for
.Nm .
This prevents a large burst from a single interface
from saturating the IP interrupt queue
.Pq Va net.inet.ip.intr_queue_maxlen .
Default is 5.
.Pp
.It Va kern.polling.burst_max
Upper bound for
.Va kern.polling.burst .
Note that when
.Nm
is enabled, each interface can receive at most
.Pq Va HZ No * Va burst_max
packets per second unless there are spare CPU cycles available for
.Nm
in the idle loop.
This number should be tuned to match the expected load
(which can be quite high with GigE cards).
Default is 150, which is adequate for a 100 Mbit network and HZ=1000.
.Pp
.It Va kern.polling.idle_poll
Controls whether
.Nm
is enabled in the idle loop.
There are no reasons (other than power saving or bugs in the scheduler's
handling of idle priority kernel threads) to disable this.
.Pp
.It Va kern.polling.reg_frac
Controls how often (every
.Va reg_frac No / Va HZ
seconds) the status registers of the device are checked for error
conditions and the like.
Increasing this value reduces the load on the bus, but also delays
error detection.
Default is 20.
.Pp
.It Va kern.polling.handlers
How many active devices have registered for
.Nm .
.Pp
.It Va kern.polling.short_ticks
.It Va kern.polling.lost_polls
.It Va kern.polling.pending_polls
.It Va kern.polling.residual_burst
.It Va kern.polling.phase
.It Va kern.polling.suspect
.It Va kern.polling.stalled
Debugging variables.
.El
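The two formulas given for burst_max and reg_frac can be worked through
with a short sketch.
The helper names max_pps and reg_check_ms are illustrative only; they take
HZ as the first argument and the sysctl value as the second.
With the defaults and HZ=1000, the receive ceiling is 150000 packets per
second, slightly above the roughly 148800 minimum-size frames per second
that 100 Mbit Ethernet can carry.

```shell
#!/bin/sh
# Per-interface ceiling on received packets per second when no idle-loop
# polling takes place: HZ * burst_max.
max_pps() {
    echo $(( $1 * $2 ))
}

# Interval between status-register checks, in milliseconds:
# reg_frac / HZ seconds, scaled by 1000.
reg_check_ms() {
    echo $(( $2 * 1000 / $1 ))
}

max_pps 1000 150       # prints 150000
reg_check_ms 1000 20   # prints 20
```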
.Sh SUPPORTED DEVICES
Device polling requires explicit modifications to the device drivers.
As of this writing, the
.Xr bge 4 ,
.Xr dc 4 ,
.Xr em 4 ,
.Xr fwe 4 ,
.Xr fwip 4 ,
.Xr fxp 4 ,
.Xr igb 4 ,
.Xr ixgb 4 ,
.Xr nfe 4 ,
.Xr nge 4 ,
.Xr re 4 ,
.Xr rl 4 ,
.Xr sf 4 ,
.Xr sis 4 ,
.Xr ste 4 ,
.Xr stge 4 ,
.Xr vge 4 ,
.Xr vr 4 ,
and
.Xr xl 4
devices are supported, with others in the works.
The modifications are rather straightforward, consisting of
extracting the inner part of the interrupt service routine
and writing a callback function,
.Fn *_poll ,
which is invoked
to probe the device for events and process them.
(See the
conditionally compiled sections of the drivers mentioned above
for more details.)
.Pp
Because in the worst case the devices are only polled on clock interrupts,
it is not advisable to decrease the frequency of the clock below 1000 Hz,
in order to keep the latency in processing packets low.
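.Pp
To illustrate the latency argument, the following sketch computes the
worst-case wait before a device is next polled, assuming polls happen only
on clock ticks and never in the idle loop.
The helper name poll_latency_us is illustrative.
The clock frequency itself is commonly set with the kern.hz loader tunable
or the HZ kernel option; check the documentation for your release before
relying on either.

```shell
#!/bin/sh
# Worst-case delay, in microseconds, before a device is next polled when
# polling is driven only by clock ticks: one full tick, i.e. 1/HZ seconds.
poll_latency_us() {
    echo $(( 1000000 / $1 ))
}

poll_latency_us 1000   # prints 1000 (1 ms)
poll_latency_us 100    # prints 10000 (10 ms)
```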
.Sh HISTORY
Device polling first appeared in
.Fx 4.6
and
.Fx 5.0 .
.Sh AUTHORS
Device polling was written by
.An Luigi Rizzo Aq luigi@iet.unipi.it .