/docs/manual/source/templates/classification/quickstart.html.md.erb

https://gitlab.com/github-cloud-corporation/incubator-predictionio
---
title: Quick Start - Classification Engine Template
---
## Overview

An engine template is an almost-complete implementation of an engine.
PredictionIO's Classification Engine Template
integrates **Apache Spark MLlib**'s Naive Bayes algorithm by default.

The default use case of the Classification Engine Template is to predict the service
plan (*plan*) a user will subscribe to based on three of the user's properties: *attr0*,
*attr1* and *attr2*.

You can customize it easily to fit your specific use case and needs.

We are going to show you how to create your own classification engine for
production use based on this template.
## Usage

### Event Data Requirements

By default, the template requires the following events to be collected:

- user `$set` event, which sets the attributes of the user

NOTE: You can customize the template to use other events.

### Input Query

- individual attribute values (for version >= v0.3.1)

WARNING: For versions < v0.3.1, the query is an array of feature values.

### Output PredictedResult

- the predicted label
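
The two query shapes described above can be sketched in a few lines of Python. This is only an illustration: `build_query` and its `legacy` flag are hypothetical names introduced here and are not part of any PredictionIO SDK. The field names (`attr0`, `attr1`, `attr2`, `features`) are the ones used in this guide.

```python
import json


def build_query(attr0, attr1, attr2, legacy=False):
    """Return the query body as a dict.

    `build_query` and `legacy` are hypothetical helpers for this sketch.
    """
    if legacy:
        # Template versions before v0.3.1: a single array of feature values.
        return {"features": [attr0, attr1, attr2]}
    # Template v0.3.1 and later: one named field per attribute.
    return {"attr0": attr0, "attr1": attr1, "attr2": attr2}


# Serialize both shapes so they can be compared side by side.
current_body = json.dumps(build_query(2, 0, 0))
legacy_body = json.dumps(build_query(2, 0, 0, legacy=True))
print(current_body)
print(legacy_body)
```

Keeping the version switch in one place like this may make it easier to support both template versions from the same client code.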
## 1. Install and Run PredictionIO

<%= partial 'shared/quickstart/install' %>

## 2. Create a new Engine from an Engine Template

<%= partial 'shared/quickstart/create_engine', locals: { engine_name: 'MyClassification', template_name: 'Classification Engine Template', template_repo: 'template-scala-parallel-classification' } %>

## 3. Generate an App ID and Access Key

<%= partial 'shared/quickstart/create_app' %>
## 4. Collecting Data

Next, let's collect some training data. By default, the Classification Engine Template reads 4 properties of a user record: attr0, attr1, attr2 and plan. This template requires `$set` user events.

INFO: This template can easily be customized to use different or additional attributes.

<%= partial 'shared/quickstart/collect_data' %>

To set properties "attr0", "attr1", "attr2" and "plan" for user "u0" at time `2014-11-02T09:39:45.618-08:00` (the current time will be used if eventTime is not specified), you can send a `$set` event for the user. To send this event, run the following `curl` command:
<div class="tabs">
<div data-tab="REST API" data-lang="json">
```
$ curl -i -X POST http://localhost:7070/events.json?accessKey=$ACCESS_KEY \
-H "Content-Type: application/json" \
-d '{
  "event" : "$set",
  "entityType" : "user",
  "entityId" : "u0",
  "properties" : {
    "attr0" : 0,
    "attr1" : 1,
    "attr2" : 0,
    "plan" : 1
  },
  "eventTime" : "2014-11-02T09:39:45.618-08:00"
}'
```
</div>
<div data-tab="Python SDK" data-lang="python">
```python
import predictionio

client = predictionio.EventClient(
    access_key=<ACCESS KEY>,
    url=<URL OF EVENTSERVER>,
    threads=5,
    qsize=500
)

# Set the 4 properties for a user
client.create_event(
    event="$set",
    entity_type="user",
    entity_id=<USER ID>,
    properties={
        "attr0": int(<VALUE OF ATTR0>),
        "attr1": int(<VALUE OF ATTR1>),
        "attr2": int(<VALUE OF ATTR2>),
        "plan": int(<VALUE OF PLAN>)
    }
)
```
</div>
<div data-tab="PHP SDK" data-lang="php">
```php
<?php
require_once("vendor/autoload.php");
use predictionio\EventClient;

$client = new EventClient(<ACCESS KEY>, <URL OF EVENTSERVER>);

// Set the 4 properties for a user
$client->createEvent(array(
  'event' => '$set',
  'entityType' => 'user',
  'entityId' => <USER ID>,
  'properties' => array(
    'attr0' => <VALUE OF ATTR0>,
    'attr1' => <VALUE OF ATTR1>,
    'attr2' => <VALUE OF ATTR2>,
    'plan' => <VALUE OF PLAN>
  )
));
?>
```
</div>
<div data-tab="Ruby SDK" data-lang="ruby">
```ruby
require 'predictionio'

# Create a client object.
client = PredictionIO::EventClient.new(<ACCESS KEY>, <URL OF EVENTSERVER>)

# Set the 4 properties for a user.
client.create_event(
  '$set',
  'user',
  <USER ID>, {
    'properties' => {
      'attr0' => <VALUE OF ATTR0 (integer)>,
      'attr1' => <VALUE OF ATTR1 (integer)>,
      'attr2' => <VALUE OF ATTR2 (integer)>,
      'plan' => <VALUE OF PLAN (integer)>,
    }
  }
)
```
</div>
<div data-tab="Java SDK" data-lang="java">
```java
import com.google.common.collect.ImmutableMap;
import org.apache.predictionio.Event;
import org.apache.predictionio.EventClient;

EventClient client = new EventClient(<ACCESS KEY>, <URL OF EVENTSERVER>);

// Set the 4 properties for a user
Event event = new Event()
    .event("$set")
    .entityType("user")
    .entityId(<USER ID>)
    .properties(ImmutableMap.<String, Object>of(
        "attr0", <VALUE OF ATTR0>,
        "attr1", <VALUE OF ATTR1>,
        "attr2", <VALUE OF ATTR2>,
        "plan", <VALUE OF PLAN>
    ));
client.createEvent(event);
```
</div>
</div>
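
If you prefer not to write the JSON body by hand, it can be generated from a dictionary, which guarantees well-formed output. The following is a minimal sketch using only the Python standard library. `make_set_event` is a hypothetical helper name introduced for illustration, and the field names mirror the `curl` example above.

```python
import json


def make_set_event(entity_id, properties, event_time=None):
    """Build the body of a `$set` event for a user entity.

    `make_set_event` is a hypothetical helper for this sketch.
    """
    event = {
        "event": "$set",
        "entityType": "user",
        "entityId": entity_id,
        "properties": properties,
    }
    # eventTime is optional: the current time is used if it is not specified.
    if event_time is not None:
        event["eventTime"] = event_time
    return event


# Reproduce the body of the u0 example and print it as JSON.
body = json.dumps(
    make_set_event(
        "u0",
        {"attr0": 0, "attr1": 1, "attr2": 0, "plan": 1},
        event_time="2014-11-02T09:39:45.618-08:00",
    ),
    indent=2,
)
print(body)
```

The printed text can be passed to `curl` with `-d`, or sent with any HTTP client.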
Note that you can also set the properties for the user with multiple `$set` events (they will be aggregated during engine training).

To set properties "attr0", "attr1", "attr2" and "plan" for user "u1" at different times, you can send the following `$set` events for the user. To send these events, run the following `curl` commands:
<div class="tabs">
<div data-tab="REST API" data-lang="json">
```
$ curl -i -X POST http://localhost:7070/events.json?accessKey=$ACCESS_KEY \
-H "Content-Type: application/json" \
-d '{
  "event" : "$set",
  "entityType" : "user",
  "entityId" : "u1",
  "properties" : {
    "attr0" : 0
  },
  "eventTime" : "2014-11-02T09:39:45.618-08:00"
}'

$ curl -i -X POST http://localhost:7070/events.json?accessKey=$ACCESS_KEY \
-H "Content-Type: application/json" \
-d '{
  "event" : "$set",
  "entityType" : "user",
  "entityId" : "u1",
  "properties" : {
    "attr1" : 1,
    "attr2" : 0
  },
  "eventTime" : "2014-11-02T09:39:45.618-08:00"
}'

$ curl -i -X POST http://localhost:7070/events.json?accessKey=$ACCESS_KEY \
-H "Content-Type: application/json" \
-d '{
  "event" : "$set",
  "entityType" : "user",
  "entityId" : "u1",
  "properties" : {
    "plan" : 1
  },
  "eventTime" : "2014-11-02T09:39:45.618-08:00"
}'
```
</div>
<div data-tab="Python SDK" data-lang="python">
```python
# You may also set the properties one by one
client.create_event(
    event="$set",
    entity_type="user",
    entity_id=<USER ID>,
    properties={
        "attr0": int(<VALUE OF ATTR0>)
    }
)
client.create_event(
    event="$set",
    entity_type="user",
    entity_id=<USER ID>,
    properties={
        "attr1": int(<VALUE OF ATTR1>),
        "attr2": int(<VALUE OF ATTR2>)
    }
)
client.create_event(
    event="$set",
    entity_type="user",
    entity_id=<USER ID>,
    properties={
        "plan": int(<VALUE OF PLAN>)
    }
)
```
</div>
<div data-tab="PHP SDK" data-lang="php">
```php
<?php
// You may also set the properties one by one
$client->createEvent(array(
  'event' => '$set',
  'entityType' => 'user',
  'entityId' => <USER ID>,
  'properties' => array(
    'attr0' => <VALUE OF ATTR0>
  )
));
$client->createEvent(array(
  'event' => '$set',
  'entityType' => 'user',
  'entityId' => <USER ID>,
  'properties' => array(
    'attr1' => <VALUE OF ATTR1>,
    'attr2' => <VALUE OF ATTR2>
  )
));
$client->createEvent(array(
  'event' => '$set',
  'entityType' => 'user',
  'entityId' => <USER ID>,
  'properties' => array(
    'plan' => <VALUE OF PLAN>
  )
));
?>
```
</div>
<div data-tab="Ruby SDK" data-lang="ruby">
```ruby
# You may also set the properties one by one.
client.create_event(
  '$set',
  'user',
  <USER ID>, {
    'properties' => {
      'attr0' => <VALUE OF ATTR0 (integer)>
    }
  }
)
client.create_event(
  '$set',
  'user',
  <USER ID>, {
    'properties' => {
      'attr1' => <VALUE OF ATTR1 (integer)>,
    }
  }
)
# Etc...
```
</div>
<div data-tab="Java SDK" data-lang="java">
```java
// You may also set the properties one by one
client.createEvent(new Event()
    .event("$set")
    .entityType("user")
    .entityId(<USER ID>)
    .property("attr0", <VALUE OF ATTR0>));
client.createEvent(new Event()
    .event("$set")
    .entityType("user")
    .entityId(<USER ID>)
    .property("attr1", <VALUE OF ATTR1>)
    .property("attr2", <VALUE OF ATTR2>));
client.createEvent(new Event()
    .event("$set")
    .entityType("user")
    .entityId(<USER ID>)
    .property("plan", <VALUE OF PLAN>));
```
</div>
</div>
The properties of the `user` can be set, unset, or deleted with the special events **$set**, **$unset** and **$delete**. Please refer to [Event API](/datacollection/eventapi/#note-about-properties) for more details on using these events.
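
To build intuition for how several special events combine into one set of user properties, here is a small Python sketch. It is an illustration only and not PredictionIO's implementation: `aggregate_properties` is a hypothetical function, the timestamps are made-up sample values, and it assumes that events are applied in `eventTime` order with later values overriding earlier ones.

```python
def aggregate_properties(events):
    """Fold $set / $unset / $delete events into per-entity properties.

    Illustrative only; assumes events are applied in eventTime order.
    """
    state = {}
    # All timestamps in this sketch share one UTC offset, so comparing
    # the ISO 8601 strings directly gives chronological order.
    for e in sorted(events, key=lambda e: e["eventTime"]):
        entity = e["entityId"]
        if e["event"] == "$set":
            # Later values override earlier ones, property by property.
            state.setdefault(entity, {}).update(e.get("properties", {}))
        elif e["event"] == "$unset":
            # Remove only the listed properties.
            for name in e.get("properties", {}):
                state.get(entity, {}).pop(name, None)
        elif e["event"] == "$delete":
            # Remove the entity and all of its properties.
            state.pop(entity, None)
    return state


# Hypothetical sample events for user u1 (timestamps are invented).
events = [
    {"event": "$set", "entityId": "u1", "properties": {"attr0": 0},
     "eventTime": "2014-11-02T09:39:45.618-08:00"},
    {"event": "$set", "entityId": "u1", "properties": {"attr1": 1, "attr2": 0},
     "eventTime": "2014-11-03T09:39:45.618-08:00"},
    {"event": "$set", "entityId": "u1", "properties": {"plan": 1},
     "eventTime": "2014-11-04T09:39:45.618-08:00"},
]
print(aggregate_properties(events))
```

After the three `$set` events, `u1` carries all four properties, just as if they had been sent in a single event.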
<%= partial 'shared/quickstart/query_eventserver' %>

### Import More Sample Data

<%= partial 'shared/quickstart/import_sample_data' %>

A Python import script `import_eventserver.py` is provided to import the data into
the Event Server using the Python SDK. Please upgrade to the latest Python SDK.

<%= partial 'shared/quickstart/install_python_sdk' %>

Make sure you are under the `MyClassification` directory. Replace the value of the `access_key` parameter with your **Access Key** and run:
```
$ cd MyClassification
$ python data/import_eventserver.py --access_key obbiTuSOiMzyFKsvjjkDnWk1vcaHjcjrv9oT3mtN3y6fOlpJoVH459O1bPmDzCdv
```

You should see the following output:

```
Importing data...
6 events are imported.
```

Now the training data is stored as events inside the Event Store.

<%= partial 'shared/quickstart/query_eventserver_short' %>
## 5. Deploy the Engine as a Service

<%= partial 'shared/quickstart/deploy_enginejson', locals: { engine_name: 'MyClassification' } %>

<%= partial 'shared/quickstart/deploy', locals: { engine_name: 'MyClassification' } %>
## 6. Use the Engine

Now you can try to retrieve predicted results. For example, to predict the
label (i.e. *plan* in this case) of a user with attr0=2, attr1=0 and attr2=0,
you send the JSON `{ "attr0":2, "attr1":0, "attr2":0 }` to the deployed engine, and it will
return a JSON of the predicted plan. Simply send a query by making an HTTP
request or through the `EngineClient` of an SDK.

With the deployed engine running, open another terminal and run the following `curl` command or use an SDK to send the query:
<div class="tabs">
<div data-tab="REST API" data-lang="bash">
```bash
$ curl -H "Content-Type: application/json" \
-d '{ "attr0":2, "attr1":0, "attr2":0 }' http://localhost:8000/queries.json
```
</div>
<div data-tab="Python SDK" data-lang="python">
```python
import predictionio

engine_client = predictionio.EngineClient(url="http://localhost:8000")
# print() with parentheses works on both Python 2 and Python 3.
print(engine_client.send_query({"attr0": 2, "attr1": 0, "attr2": 0}))
```
</div>
<div data-tab="PHP SDK" data-lang="php">
```php
<?php
require_once("vendor/autoload.php");
use predictionio\EngineClient;

$client = new EngineClient('http://localhost:8000');

$response = $client->sendQuery(array('attr0' => 2, 'attr1' => 0, 'attr2' => 0));
print_r($response);
?>
```
</div>
<div data-tab="Ruby SDK" data-lang="ruby">
```ruby
# Create client object.
client = PredictionIO::EngineClient.new(<ENGINE DEPLOY URL>)

# Query PredictionIO.
response = client.send_query('attr0' => 2, 'attr1' => 0, 'attr2' => 0)

puts response
```
</div>
<div data-tab="Java SDK" data-lang="java">
```java
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import com.google.gson.JsonObject;
import org.apache.predictionio.EngineClient;

EngineClient engineClient = new EngineClient(<ENGINE DEPLOY URL>);

// Send the three attribute values as the query
JsonObject response = engineClient.sendQuery(ImmutableMap.<String, Object>of(
    "attr0", 2,
    "attr1", 0,
    "attr2", 0
));
```
</div>
</div>
WARNING: The query format changed in version v0.3.1. If you are using Classification template version v0.3.0 or earlier, the query is an array of feature values instead: `{ "features": [2, 0, 0] }`.

The following is a sample JSON response:

```
{"label":0.0}
```
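
If no SDK is available, the same query can be sent with the Python standard library alone. The sketch below is one possible approach under stated assumptions: `build_request` and `parse_label` are hypothetical helper names, and the URL is the default engine address used in this guide. The actual network call is shown only as a comment because it requires the deployed engine to be running.

```python
import json
import urllib.request


def build_request(query, url="http://localhost:8000/queries.json"):
    """Prepare a POST request carrying the query as a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def parse_label(body):
    """Extract the predicted label from the raw response bytes."""
    return json.loads(body.decode("utf-8"))["label"]


request = build_request({"attr0": 2, "attr1": 0, "attr2": 0})

# With the engine running, send the request and read the label:
#
#     with urllib.request.urlopen(request) as response:
#         print(parse_label(response.read()))

# Parsing the sample response shown in this guide:
print(parse_label(b'{"label":0.0}'))
```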
Similarly, to predict the label (i.e. *plan* in this case) of a user with
attr0=4, attr1=3 and attr2=8, you send the JSON `{ "attr0": 4, "attr1": 3, "attr2": 8 }` to
the deployed engine, and it will return a JSON of the predicted plan.

WARNING: For Classification template version v0.3.0 or earlier, the query JSON would be `{ "features": [4, 3, 8] }`.

*MyClassification* is now running.

<%= partial 'shared/quickstart/production' %>

Next, we are going to take a look at the engine
architecture and explain how you can customize it completely.

#### [Next: DASE Components Explained](/templates/classification/dase/)