// Source: /testing/libfuzzer/fuzzers/url.proto (http://github.com/chromium/chromium)

// Copyright 2017 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

// This file contains the definition of the Url protobuf used in the
// url_parse_proto_fuzzer that is meant to serve as an example for future
// Chromium fuzzers that use libprotobuf-mutator.
// For this fuzzer, we will consider the format of a URL to be
// [scheme:][//[user[:password]@]host[:port]][/path][?query][#fragment]
// There may be some URLs Chromium treats as valid that this syntax does not
// capture. However, we will ignore them for the sake of simplicity.
// It is recommended to read this file in conjunction with
// convert_protobuf_to_string() in url_parse_proto_fuzzer.cc, as logic in that
// function is sometimes used to ensure that the Url Protocol Buffer obeys the
// syntax we have defined for URLs. We have roughly followed RFC 3986
// (https://tools.ietf.org/html/rfc3986), which defines the syntax of URIs (of
// which URLs are a subset), though reading it is completely unnecessary for
// understanding this fuzzer.
syntax = "proto2";

package url_parse_proto_fuzzer;
// Here we define the format for a Url Protocol Buffer. This will be passed to
// our fuzzer function.
message Url {
  // If there is a scheme, then it must be followed by a colon. A scheme is in
  // practice not required in a URL. Therefore, we will define the scheme as
  // optional but ensure it is followed by a colon in our conversion code if it
  // is included.
  optional string scheme = 1;
  enum Slash {
    NONE = 0;      // Separate path segments using ""
    FORWARD = 1;   // Separate path segments using /
    BACKWARD = 2;  // Separate path segments using \
  }
  // The syntax rules of the two slashes that precede the host in a URL are
  // surprisingly complex. They are not required, even if a scheme is included
  // (http:example.com is treated as valid), and are valid even if a scheme is
  // not included (//example.com is treated as file:///example.com). They can
  // even be backslashes (http:\\example.com and http:\/example.com are both
  // valid) and there can be any number of them (http:/example.com and
  // http://////example.com are both valid).
  // We will therefore define slashes as a list of enum values (repeated Slash).
  // In our conversion code, this will be read to append the appropriate kind
  // and number of slashes to the URL.
  repeated Slash slashes = 2 [packed = true];
  // The [user:password@] part of the URL shown above is called the userinfo.
  // Userinfo is not mandatory, but if it is included in a URL, then it must
  // contain a string called user. There is another optional field in userinfo
  // called the password. If a password is included, the user must be separated
  // from it by ":". In either case, the userinfo must be separated from the
  // host by "@". A URL must have a host if it has a userinfo.
  // These requirements will be ensured by the conversion code.
  message Userinfo {
    required string user = 1;
    optional string password = 2;
  }
  optional Userinfo userinfo = 3;
  // Hosts, like most else in our Url definition, are optional (there are
  // URLs, such as data URLs, that do not have hosts).
  optional string host = 4;
  // Ports are unsigned integers between 0 and 2^16 - 1. The closest type to
  // this in the proto2 format is uint32. Also, if a port number is specified,
  // it must be preceded by a colon (in "google.com80", the 80 would be
  // interpreted as part of the host). The conversion code will ensure this is
  // the case.
  optional uint32 port = 5;
  // The rules for the path are somewhat complex. A path is not required.
  // However, if it follows a port or host, it must start with "/" according
  // to the RFC, though Chromium accepts "\" as it converts all backslashes to
  // slashes. It does not need to start with "/" if there is no host (in data
  // URLs, for example). Thus we will define path as a repeated string where
  // each member contains a segment of the path and will be preceded by the
  // path_separator. The one exception is the first segment: if
  // path_separator == NONE and both the path and the host are non-empty, then
  // the first segment will be preceded by "/".
  repeated string path = 6;
  required Slash path_separator = 7 [default = FORWARD];
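  // Illustrative sketch of the exception above (not taken from the conversion
  // code; the field values are arbitrary). Given, in protobuf text format:
  //   host: "example.com"
  //   path: "a"
  //   path: "b"
  //   path_separator: NONE
  // the rules described above suggest a result along the lines of
  //   example.com/ab
  // that is, a "/" is inserted before the first segment only, and the
  // remaining segments are joined with the empty separator.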
  // A query must be preceded by "?". This will be ensured in the conversion
  // code. Queries can have many components, which the converter will separate
  // using "&", as is the convention.
  repeated string query = 8;
  // A fragment must be preceded by "#". This will be ensured in the conversion
  // code.
  optional string fragment = 9;
}
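
// Example (illustrative only; the values are arbitrary and this is a sketch,
// not output captured from the fuzzer). A Url message in protobuf text format:
//
//   scheme: "http"
//   slashes: FORWARD
//   slashes: FORWARD
//   userinfo { user: "user" password: "pass" }
//   host: "example.com"
//   port: 8080
//   path: "a"
//   path: "b"
//   path_separator: FORWARD
//   query: "x=1"
//   query: "y=2"
//   fragment: "top"
//
// Assuming the conversion code applies the rules described in the comments
// above, this message would be expected to map to a URL string like
//   http://user:pass@example.com:8080/a/b?x=1&y=2#top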