/lang_php/analyze/checker/check_variables_php.ml
OCaml | 961 lines | 481 code | 98 blank | 382 comment | 11 complexity | 776f8c583cdcbec389011e68d3ce6d2f MD5 | raw file
1(* Yoann Padioleau 2 * 3 * Copyright (C) 2010, 2012 Facebook 4 * 5 * This library is free software; you can redistribute it and/or 6 * modify it under the terms of the GNU Lesser General Public License 7 * version 2.1 as published by the Free Software Foundation, with the 8 * special exception on linking described in file license.txt. 9 * 10 * This library is distributed in the hope that it will be useful, but 11 * WITHOUT ANY WARRANTY; without even the implied warranty of 12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the file 13 * license.txt for more details. 14 *) 15open Common 16 17open Ast_php_simple 18module A = Ast_php_simple 19module E = Error_php 20module S = Scope_code 21module Ent = Entity_code 22module PI = Parse_info 23 24(*****************************************************************************) 25(* Prelude *) 26(*****************************************************************************) 27(* 28 * This module helps find stupid PHP mistakes related to variables. See 29 * tests/php/scheck/variables.php for examples of bugs currently 30 * detected by this checker. This module not only checks but also annotates 31 * the AST with scoping information as a side effect. This is useful 32 * in codemap to display differently references to parameters, local vars, 33 * global vars, etc. 34 * 35 * This file mostly deals with scoping issues. Scoping is different 36 * from typing! Those are two orthogonal programming language concepts. 37 * old: This file is concerned with variables, that is Ast_php.dname 38 * entities, so for completness C-s for dname in ast_php.ml and 39 * see if all uses of it are covered. Other files are more concerned 40 * about checks related to entities, that is Ast_php.name. 41 * 42 * The errors detected here are mostly: 43 * - UseOfUndefinedVariable 44 * - UnusedVariable 45 * Some similar checks are done by JSlint. 46 * 47 * Some issues: 48 * - detecting variable-related mistakes is made slightly more complicated 49 * by PHP because of the lack of declaration in the language; 50 * the first assignment "declares" the variable (on the other side 51 * the PHP language forces people to explicitly declared 52 * the use of global variables (via the 'global' statement) which 53 * makes certain things easier). 54 * 55 * - variables passed by reference can look like UseOfUndefinedVariable 56 * bugs but they are not. One way to fix it is to do a global analysis that 57 * remembers what are all the functions taking arguments by reference 58 * and whitelist them here. But it has a cost. One can optimize a little 59 * this by using an entity_finder computed semi lazily a la cmf multi 60 * level approach (recursive a la cpp, flib-map, git-grep, git head). 61 * Another way is to force programmers to actually declare such variables 62 * before those kinds of function calls (this is what Evan advocated). 63 * 64 * - people abuse assignements in function calls to emulate "keyword arguments" 65 * as in 'foo($color = "red", $width = 10)'. Such assignments looks 66 * like UnusedVariable but they are not. One can fix that by detecting 67 * such code patterns. 68 * 69 * - functions like extract(), param_get(), param_post() 70 * or variable variables like $$x introduce some false positives. 71 * Regarding the param_get/param_post(), one way to fix it is to just 72 * not analyse toplevel code. Another solution is to hardcode a few 73 * analysis that recognizes the arguments of those functions. Finally 74 * for the extract() and $$x one can just bailout of such code or 75 * as Evan did remember the first line where such code happen and 76 * don't do any analysis pass this point. 77 * 78 * - any analysis will probably flag lots of warnings on an existing PHP 79 * codebase. Some programmers may find legitimate certain things, 80 * for instance having variables declared in a foreach to escape its 81 * foreach scope. This would then hinder the whole analysis because 82 * people would just not run the analysis. You need the approval of 83 * the PHP developers on such analysis first and get them ok to change 84 * their coding styles rules. One way to fix this problem is to have 85 * a strict mode where only certain checks are enabled. A good 86 * alternative is also to rank errors. A final trick is to report 87 * only new errors. 88 * 89 * Here are some extra notes by Evan in his own variable linter: 90 * 91 * "These things declare variables in a function": 92 * - DONE Explicit parameters 93 * - DONE Assignment (pad: variable mentionned for the first time) 94 * - DONE Assignment via list() 95 * - DONE Static, Global 96 * - DONE foreach() 97 * - DONE catch 98 * - DONE Builtins ($this) 99 * - DONE Lexical vars, in php 5.3 lambda expressions 100 * pad: forgot pass by reference variables 101 * 102 * "These things make lexical scope unknowable": 103 * - DONE Use of extract() 104 * - DONE Assignment or Global with variable variables ($$x) 105 * (pad: so just bailout on such code) 106 * pad: forgot eval()? but eval can introduce new variables in scope? 107 * 108 * These things don't count as "using" a variable: 109 * - DONE isset() (pad: this should be forbidden, it's a bad way to program) 110 * - DONE empty() 111 * - DONE Static class variables (pad: check done in check_classes instead) 112 * 113 * Here are a few additional checks and features of this checker: 114 * - when the strict_scope flag is set, check_variables will 115 * emulate a block-scoped language as in JSLint and flags 116 * variables used outside their "block". 117 * - when passed the find_entity hook, check_variables will 118 * know about functions taking parameters by refs, which removes 119 * some false positives 120 * 121 * history: 122 * - sgrimm had the simple idea of just counting the number of occurences 123 * of a variable in a program, at the token level. If only 1, then 124 * probably a typo. But sometimes variable names are mentionned in 125 * interface signatures in which case they occur only once. So you need 126 * some basic analysis; the token level is not enough. You may not 127 * need the CFG but at least you need the AST to differentiate the 128 * different kinds of unused variables. 129 * - Some of the scoping logic was previously in another file, scoping_php.ml 130 * but we were kind of duplicating the logic that is in scoping_php.ml. 131 * PHP has bad scoping rule allowing variables declared through a foreach 132 * to be used outside the foreach, which is probably wrong. 133 * Unfortunately, knowing from scoping_php.ml just the status of a 134 * variable (local, global, or param) is not good enough to find bugs 135 * related to those weird scoping rules. So I've put all variable scope 136 * related stuff in this file and removed the duplication in scoping_php.ml. 137 * - I was using ast_php.ml and a visitor approach but then I rewrote it 138 * to use ast_php_simple and an "env" approach because the code was 139 * getting ugly and was containing false positives that were hard to fix. 140 * As a side effect of the refactoring, some bugs disappeared (nested 141 * assigns in if, list() not at the toplevel of an expression, undefined 142 * access to an array), and code regarding lexical variables became 143 * more clear because localized in one place. 144 * 145 * TODO OTHER: 146 * - factorize code for the shared_ref thing and create_new_local_if_necessary 147 * - the old checker was handling correctly globals? was it looking up 148 * in the top scope? add some unit tests. 149 * - put back strict block scope 150 * - factorize in add_var() that adds in env.vars and 151 * update env.scoped_vars_used too. Removed refs to scope_ref in this file 152 *) 153 154(*****************************************************************************) 155(* Types, constants *) 156(*****************************************************************************) 157type env = { 158 (* The ref in the tuple is to remember the number of uses of the variable, 159 * for the UnusedVariable check. 160 * The ref for the Map.t is to avoid threading the env, because 161 * any stmt/expression can introduce new variables. 162 * 163 * todo? use a list of Map.t to represent nested scopes? 164 * (globals, methods/functions, nested blocks)? when in strict/block mode? 165 *) 166 vars: (string, (Ast_php.tok * Scope_code.scope * int ref)) Map_poly.t ref; 167 168 (* to remember how to annotate Var in ast_php.ml *) 169 scope_vars_used: (Ast_php.tok, Scope_code.scope) Hashtbl.t; 170 171 (* todo: have a globals:? *) 172 173 (* we need to access the definitions of functions/methods to know 174 * if an argument was passed by reference, in which case what looks 175 * like a UseOfUndefinedVariable is actually not (but it would be 176 * better for them to fix the code to introduce/declare this variable 177 * before ...). 178 *) 179 db: Entity_php.entity_finder option; 180 (* when analyze $this->method_call(), we need to know the enclosing class *) 181 in_class: ident option; 182 183 (* for better error message when the variable was inside a long lambda *) 184 in_long_lambda: bool; 185 186 (* when the body of a function contains eval/extract/... we bailout 187 * because we don't want to report false positives 188 *) 189 bailout: bool ref; 190} 191 192(*****************************************************************************) 193(* Helpers *) 194(*****************************************************************************) 195 196let unused_ok s = 197 s =~ "\\$_.*" || 198 s =~ "\\$ignore.*" || 199 List.mem s ["$unused";"$dummy";"$guard"] || 200 (if !E.strict 201 then false 202 else 203 List.mem s [ 204 "$res"; "$retval"; "$success"; "$is_error"; "$rs"; "$ret"; 205 "$e"; "$ex"; "$exn"; (* exception *) 206 ] 207 ) 208 209let lookup_opt s vars = 210 Common2.optionise (fun () -> Map_poly.find s vars) 211 212let s_tok_of_ident name = 213 A.str_of_ident name, A.tok_of_ident name 214 215(* to help debug *) 216let str_of_any any = 217 let v = Meta_ast_php_simple.vof_any any in 218 Ocaml.string_of_v v 219 220(*****************************************************************************) 221(* Vars passed by ref *) 222(*****************************************************************************) 223(* 224 * Detecting variables passed by reference is complicated in PHP because 225 * one does not have to use &$var at the call site (one can though). This is 226 * ugly. So to detect variables passed by reference, we need to look at 227 * the definition of the function/method called, hence the need for a 228 * entity_finder in env.db. 229 * 230 * note that it currently returns an Ast_php.func_def, not 231 * an Ast_php_simple.func_def because the database currently 232 * stores concrete ASTs, not simple ASTs. 233 *) 234let funcdef_of_call_or_new_opt env e = 235 match env.db with 236 | None -> None 237 | Some find_entity -> 238 (match e with 239 | Id [name] -> 240 (* simple function call *) 241 let (s, tok) = s_tok_of_ident name in 242 (match find_entity (Ent.Function, s) with 243 | [Ast_php.FunctionE def] -> Some def 244 (* normally those errors should be triggered in 245 * check_functions_php.ml, but right now this file uses 246 * ast_php.ml and not ast_php_simple.ml, so there are some 247 * differences in the logic so we double check things here. 248 *) 249 | [] -> 250 E.fatal tok (E.UndefinedEntity (Ent.Function, s)); 251 None 252 | _x::_y::_xs -> 253 E.fatal tok (E.MultiDefinedEntity (Ent.Function, s, ("",""))); 254 None 255 | _ -> raise Impossible 256 ) 257 (* dynamic function call *) 258 | Var _ -> None 259 260 (* static method call *) 261 | Class_get (Id name1, Id name2) -> 262 (* todo: name1 can be self/parent in traits, or static: *) 263 let aclass = A.str_of_name name1 in 264 let amethod = A.str_of_name name2 in 265 (try 266 Some (Class_php.lookup_method ~case_insensitive:true 267 (aclass, amethod) find_entity) 268 (* could not find the method, this is bad, but 269 * it's not our business here; this error will 270 * be reported anyway in check_classes_php.ml anyway 271 *) 272 with 273 | Not_found | Multi_found 274 | Class_php.Use__Call|Class_php.UndefinedClassWhileLookup _ -> 275 None 276 ) 277 (* simple object call. If 'this->...()' then we can use lookup_method. 278 * Being complete and handling any method calls like $o->...() 279 * requires to know what is the type of $o which is quite 280 * complicated ... so let's skip that for now. 281 * 282 * todo: special case also id(new ...)-> ? 283 *) 284 | Obj_get (This _, Id [name2]) -> 285 (match env.in_class with 286 | Some name1 -> 287 let aclass = A.str_of_ident name1 in 288 let amethod = A.str_of_ident name2 in 289 (try 290 Some (Class_php.lookup_method ~case_insensitive:true 291 (aclass, amethod) find_entity) 292 with 293 | Not_found | Multi_found 294 | Class_php.Use__Call|Class_php.UndefinedClassWhileLookup _ -> 295 None 296 ) 297 (* wtf? use of $this outside a class? *) 298 | None -> None 299 ) 300 301 (* dynamic call, not much we can do ... *) 302 | _ -> None 303 ) 304 305(*****************************************************************************) 306(* Checks *) 307(*****************************************************************************) 308 309let check_defined env name ~incr_count = 310 let (s, tok) = s_tok_of_ident name in 311 match lookup_opt s !(env.vars) with 312 | None -> 313 (* todo? could still issue an error but with the information that 314 * there was an extract/eval/... around? 315 *) 316 if !(env.bailout) 317 then () 318 else 319 let err = 320 if env.in_long_lambda 321 then E.UseOfUndefinedVariableInLambda s 322 else 323 let allvars = 324 !(env.vars) +> Map_poly.to_list +> List.map fst in 325 let suggest = Suggest_fix_php.suggest s allvars in 326 E.UseOfUndefinedVariable (s, suggest) 327 in 328 E.fatal (A.tok_of_ident name) err 329 330 | Some (_tok, scope, access_count) -> 331 Hashtbl.add env.scope_vars_used tok scope; 332 if incr_count then incr access_count 333 334let check_used env vars = 335 vars +> Map_poly.iter (fun s (tok, scope, aref) -> 336 if !aref = 0 337 then 338 if unused_ok s 339 then () 340 else 341 (* if you use compact(), some variables may look unused but 342 * they can actually be used. See variables_fp.php. 343 *) 344 if !(env.bailout) 345 then () 346 else E.fatal tok (E.UnusedVariable (s, scope)) 347 ) 348 349let create_new_local_if_necessary ~incr_count env name = 350 let (s, tok) = s_tok_of_ident name in 351 match lookup_opt s !(env.vars) with 352 (* new local variable implicit declaration. 353 * todo: add in which nested scope? I would argue to add it 354 * only in the current nested scope. If someone wants to use a 355 * var outside the block, he should have initialized the var 356 * in the outer context. Jslint does the same. 357 *) 358 | None -> 359 env.vars := Map_poly.add s (tok, S.Local, ref 0) !(env.vars); 360 Hashtbl.add env.scope_vars_used tok S.Local; 361 362 | Some (_tok, scope, access_count) -> 363 Hashtbl.add env.scope_vars_used tok scope; 364 if incr_count then incr access_count 365 366(*****************************************************************************) 367(* Main entry point *) 368(*****************************************************************************) 369 370(* For each introduced variable (parameter, foreach variable, exception, etc), 371 * we add the binding in the environment with a counter, a la checkModule. 372 * We then check at use-time if something was declared before. We then 373 * finally check when we exit a scope that all variables were actually used. 374 *) 375let rec program env prog = 376 List.iter (stmt env) prog; 377 378 (* we must check if people used the variables declared at the toplevel 379 * context or via the param_post/param_get calls. 380 * todo: check env.globals instead? 381 *) 382 check_used env !(env.vars) 383 384(* ---------------------------------------------------------------------- *) 385(* Functions/Methods *) 386(* ---------------------------------------------------------------------- *) 387and func_def env def = 388 389 (* should not contain variables anyway, but does not hurt to check *) 390 def.f_params +> List.iter (fun p -> Common.opt (expr env) p.p_default); 391 392 let access_cnt = 393 match def.f_kind with 394 | Function | AnonLambda | ShortLambda -> 0 395 (* Don't report UnusedParameter for parameters of methods; 396 * people sometimes override a method and don't use all 397 * the parameters, hence the 1 value below. 398 * 399 * less: we we don't want parameters just in method interface 400 * to not be counted as unused Parameter 401 * less: one day we will have an @override annotation in which 402 * case we can reconsider the above design decision. 403 *) 404 | Method -> 1 405 in 406 let oldvars = !(env.vars) in 407 408 let enclosing_vars = 409 match def.f_kind with 410 (* for ShortLambda enclosing variables can be accessed *) 411 | ShortLambda -> Map_poly.to_list oldvars 412 (* fresh new scope *) 413 | _ -> 414 (Env_php.globals_builtins +> List.map (fun s -> 415 "$" ^ s, (Ast_php.fakeInfo s, S.Global, ref 1) 416 )) @ 417 (* $this is now implicitly passed in use() for closures *) 418 (try ["$this", Map_poly.find "$this" oldvars] 419 with Not_found -> [] 420 ) 421 in 422 423 let env = { env with 424 vars = ref (( 425 (def.f_params +> List.map (fun p -> 426 let (s, tok) = s_tok_of_ident p.p_name in 427 s, (tok, S.Param, ref access_cnt) 428 )) @ 429 enclosing_vars 430 ) +> Map_poly.of_list); 431 432 (* reinitialize bailout for each function/method *) 433 bailout = ref false; 434 } 435 in 436 def.l_uses +> List.iter (fun (_is_ref, name) -> 437 let (s, tok) = s_tok_of_ident name in 438 check_defined ~incr_count:true { env with vars = ref oldvars} name; 439 (* don't reuse same access count reference; the variable has to be used 440 * again in this new scope. 441 *) 442 env.vars := Map_poly.add s (tok, S.Closed, ref 0) !(env.vars); 443 ); 444 (* We put 1 as 'use_count' below because we are not interested 445 * in error message related to $this. It's ok to not use $this 446 * in a method. 447 *) 448 if def.f_kind = Method && not (A.is_static def.m_modifiers) 449 then begin 450 let tok = (Ast_php.fakeInfo "$this") in 451 env.vars := Map_poly.add "$this" (tok, S.Class, ref 1) !(env.vars); 452 end; 453 454 List.iter (stmt env) def.f_body; 455 let newvars = 456 enclosing_vars +> List.fold_left (fun acc (x, _) -> Map_poly.remove x acc) 457 !(env.vars) 458 in 459 check_used env newvars 460 461(* ---------------------------------------------------------------------- *) 462(* Stmt *) 463(* ---------------------------------------------------------------------- *) 464and stmt env = function 465 | FuncDef def -> func_def env def 466 | ClassDef def -> class_def env def 467 | ConstantDef def -> constant_def env def 468 | TypeDef def -> typedef_def env def 469 | NamespaceDef (qu, _) | NamespaceUse (qu, _) -> 470 raise (Ast_php.TodoNamespace (A.tok_of_name qu)) 471 472 | Expr e -> expr env e 473 (* todo: block scope checking when in strict mode? *) 474 | Block xs -> stmtl env xs 475 476 | If (e, st1, st2) -> 477 expr env e; 478 stmtl env [st1;st2] 479 480 | Switch (e, xs) -> 481 expr env e; 482 casel env xs 483 484 | While (e, xs) -> 485 expr env e; 486 stmtl env xs 487 | Do (xs, e) -> 488 stmtl env xs; 489 expr env e 490 | For (es1, es2, es3, xs) -> 491 exprl env (es1 @ es2 @ es3); 492 stmtl env xs 493 494 | Foreach (e1, pattern, xs) -> 495 expr env e1; 496 foreach_pattern env pattern; 497 stmtl env xs 498 499 | Return eopt 500 | Break eopt | Continue eopt -> 501 Common.opt (expr env) eopt 502 503 | Throw e -> expr env e 504 | Try (xs, cs, fs) -> 505 stmtl env xs; 506 catches env (cs); 507 finallys env (fs) 508 509 | StaticVars xs -> 510 xs +> List.iter (fun (name, eopt) -> 511 Common.opt (expr env) eopt; 512 let (s, tok) = s_tok_of_ident name in 513 (* less: check if shadows something? *) 514 env.vars := Map_poly.add s (tok, S.Static, ref 0) !(env.vars); 515 ) 516 | Global xs -> 517 xs +> List.iter (fun e -> 518 (* should be an Var most of the time. 519 * todo: should check in .globals that this variable actually exists 520 *) 521 match e with 522 | Var name -> 523 let (s, tok) = s_tok_of_ident name in 524 env.vars := Map_poly.add s (tok, S.Global, ref 0) !(env.vars); 525 (* todo: E.warning tok E.UglyGlobalDynamic *) 526 | _ -> 527 pr2 (str_of_any (Expr2 e)); 528 raise Todo 529 ) 530 531(* The scope of catch is actually also at the function level in PHP ... 532 * 533 * todo: but for this one it is so ugly that I introduce a new scope 534 * even outside strict mode. It's just too ugly. 535 * todo: check unused 536 * todo? could use local ? could have a UnusedExceptionParameter ? 537 * less: could use ref 1, the exception is often not used 538 *) 539and catch env (_hint_type, name, xs) = 540 let (s, tok) = s_tok_of_ident name in 541 env.vars := Map_poly.add s (tok, S.LocalExn, ref 0) !(env.vars); 542 stmtl env xs 543 544and finally env (xs) = 545 stmtl env xs 546 547and case env = function 548 | Case (e, xs) -> 549 expr env e; 550 stmtl env xs 551 | Default xs -> 552 stmtl env xs 553 554and stmtl env xs = List.iter (stmt env) xs 555and casel env xs = List.iter (case env) xs 556and catches env xs = List.iter (catch env) xs 557and finallys env xs = List.iter (finally env) xs 558 559and foreach_pattern env pattern = 560 561 (* People often use only one of the iterator when 562 * they do foreach like foreach(... as $k => $v). 563 * We want to make sure that at least one of 564 * the iterator variables is used, hence this trick to 565 * make them share the same access count reference. 566 *) 567 let shared_ref = ref 0 in 568 569 let rec aux e = 570 (* look Ast_php.foreach_pattern to see the kinds of constructs allowed *) 571 match e with 572 | Var name -> 573 let (s, tok) = s_tok_of_ident name in 574 (* todo: if already in scope? shadowing? *) 575 (* todo: if strict then introduce new scope here *) 576 (* todo: scope_ref := S.LocalIterator; *) 577 env.vars := Map_poly.add s (tok, S.LocalIterator, shared_ref) !(env.vars) 578 | Ref x -> aux x 579 | Arrow (e1, e2) -> aux e1; aux e2 580 | List xs -> List.iter aux xs 581 (* other kinds of lvalue are permitted too, but it's a little bit wierd 582 * and very rarely used in www 583 *) 584 | Array_get (e, eopt) -> 585 aux e; 586 eopt +> Common.do_option (expr env) 587 (* todo: E.warning tok E.WeirdForeachNoIteratorVar *) 588 | _ -> 589 failwith ("Warning: unexpected `foreach` value " ^ 590 (str_of_any (Expr2 pattern))) 591 in 592 aux pattern 593 594 595(* ---------------------------------------------------------------------- *) 596(* Expr *) 597(* ---------------------------------------------------------------------- *) 598and expr env e = 599 match e with 600 | Int _ | Double _ | String _ -> () 601 602 | Var name -> 603 check_defined ~incr_count:true env name 604 605 | Id _name -> () 606 607 | Assign (None, e1, e2) -> 608 (* e1 should be an lvalue *) 609 (match e1 with 610 | Var name -> 611 (* Does an assignation counts as a use? If you only 612 * assign and never use a variable what is the point? 613 * This should be legal only for parameters passed by reference. 614 * (note that here I talk about parameters, not arguments) 615 * 616 * TODO: hmm if you take a reference to something, then 617 * assigning something to it should be considered as a use too. 618 *) 619 create_new_local_if_necessary ~incr_count:false env name; 620 621 (* extract all vars, and share the same reference *) 622 | List xs -> 623 (* Use the same trick than for LocalIterator *) 624 let shared_ref = ref 0 in 625 626 let rec aux = function 627 (* should be an lvalue again *) 628 | Var name -> 629 let (s, tok) = s_tok_of_ident name in 630 (match lookup_opt s !(env.vars) with 631 | None -> 632 env.vars := Map_poly.add s (tok, S.ListBinded, shared_ref) 633 !(env.vars); 634 | Some (_tok, _scope, _access_cnt) -> 635 () 636 ) 637 | ((Array_get _ | Obj_get _ | Class_get _) as e) -> 638 expr env e 639 | List xs -> List.iter aux xs 640 | _ -> 641 pr2 (str_of_any (Expr2 (List xs))); 642 raise Todo 643 in 644 List.iter aux xs 645 646 | Array_get (e_arr, e_opt) -> 647 (* make sure the array is declared *) 648 expr env e_arr; 649 Common.opt (expr env) e_opt 650 651 (* checks for use of undefined member should be in check_classes *) 652 | Obj_get (_, _) | Class_get (_, _) -> 653 (* just recurse on the whole thing so the code for Obj_get/Class_get 654 * below will be triggered 655 *) 656 expr env e1 657 658 | Call (Id[(("__builtin__eval_var", _) as name)], _args) -> 659 env.bailout := true; 660 E.warning (tok_of_ident name) E.DynamicCode 661 662 (* can we have another kind of lvalue? *) 663 | e -> 664 pr2 (str_of_any (Expr2 e)); 665 failwith "WrongLvalue" 666 ); 667 expr env e2 668 669 | Assign (Some _, e1, e2) -> 670 exprl env [e1;e2] 671 672 | List _xs -> 673 let tok = Meta_ast_php_simple.toks_of_any (Expr2 e) +> List.hd in 674 failwith (spf "list(...) should be used only in an Assign context at %s" 675 (PI.string_of_info tok)) 676 (* Arrow used to be allowed only in Array and Foreach context, but now 677 * can we also have code like yield $k => $v, so this is really a pair. 678 *) 679 | Arrow (e1, e2) -> 680 exprl env [e1; e2] 681 682 (* A mention of a variable in a unset() should not be really 683 * considered as a use of variable. There should be another 684 * statement in the function that actually uses the variable. 685 *) 686 | Call (Id[ ("__builtin__unset", _tok)], args) -> 687 args +> List.iter (function 688 (* should be an lvalue again *) 689 (* less: The use of 'unset' on a variable is still not clear to me. *) 690 | Var name -> 691 check_defined ~incr_count:false env name 692 (* Unsetting a field, seems like a valid use. 693 * Unsetting a prop, not clear why you want that. 694 * Unsetting a class var, not clear why you want that either 695 *) 696 | (Array_get (_, _) | Obj_get _ | Class_get _) as e -> 697 (* make sure that the array used is actually defined *) 698 expr env e 699 | e -> 700 pr2 (str_of_any (Expr2 e)); 701 raise Todo 702 ) 703 (* special case, could factorize maybe with pass_var_by_ref *) 704 | Call (Id[ ("sscanf", _tok)], x::y::vars) -> 705 (* what if no x and y? wrong number of arguments, not our business here*) 706 expr env x; 707 expr env y; 708 vars +> List.iter (function 709 | Var name -> 710 create_new_local_if_necessary ~incr_count:false env name 711 (* less: wrong, it should be a variable? *) 712 | e -> expr env e 713 ) 714 715 (* The right fix is to forbid people to use isset/empty or unset on var. 716 * todo: could have if(isset($x)) { ... code using $x}. 717 * maybe we should have a bailout_vars and skip further errors on $x. 718 * todo: could have isset(Array_get(...) there too no? 719 *) 720 | Call (Id[ ("__builtin__isset", _tok)], [Var _name]) -> 721 () 722 (* http://php.net/manual/en/function.empty.php 723 * "empty() does not generate a warning if the variable does not exist." 724 *) 725 | Call (Id[ ("__builtin__empty", _tok)], [Var _name]) -> 726 () 727 728 729 | Call (Id[ ((("__builtin__eval" | "__builtin__eval_var" | 730 "extract" | "compact" 731 ), _) as name)], _args) -> 732 env.bailout := true; 733 E.warning (tok_of_ident name) E.DynamicCode 734 735 (* facebook specific? should be a hook instead to visit_prog? *) 736 | Call(Id[("param_post"|"param_get"|"param_request"|"param_cookie"as kind,_)], 737 (ConsArray (array_args))::rest_param_xxx_args) -> 738 739 (* have passed a 'prefix' arg, or nothing *) 740 if List.length rest_param_xxx_args <= 1 741 then begin 742 let prefix_opt = 743 match rest_param_xxx_args with 744 | [String(str_prefix, _tok_prefix)] -> 745 Some str_prefix 746 | [] -> 747 (match kind with 748 | "param_post" -> Some "post_" 749 | "param_get" -> Some "get_" 750 | "param_request" -> Some "req_" 751 | "param_cookie" -> Some "cookie_" 752 | _ -> raise Impossible 753 ) 754 | _ -> 755 (* less: display an error? weird argument to param_xxx func?*) 756 None 757 in 758 prefix_opt +> Common.do_option (fun prefix -> 759 array_args +> List.iter (function 760 | Arrow(String(param_string, tok_param), _typ_param) -> 761 let s = "$" ^ prefix ^ param_string in 762 let tok = A.tok_of_ident (param_string, tok_param) in 763 env.vars := Map_poly.add s (tok, S.Local, ref 0) !(env.vars); 764 (* less: display an error? weird argument to param_xxx func? *) 765 | _ -> () 766 ) 767 ) 768 end 769 770 | Call (e, es) -> 771 expr env e; 772 773 (* getting the def for args passed by ref false positives fix *) 774 let def_opt = funcdef_of_call_or_new_opt env e in 775 let es_with_parameters = 776 match def_opt with 777 | None -> 778 es +> List.map (fun e -> e, None) 779 | Some def -> 780 let params = 781 def.Ast_php.f_params +> Ast_php.unparen +> Ast_php.uncomma_dots 782 in 783 let rec zip args params = 784 match args, params with 785 | [], [] -> [] 786 (* more params than arguments, maybe because default parameters *) 787 | [], _y::_ys -> [] 788 (* more arguments than params, maybe because func_get_args() *) 789 | x::xs, [] -> (x, None)::zip xs [] 790 | x::xs, y::ys -> (x, Some y)::zip xs ys 791 in 792 zip es params 793 in 794 795 es_with_parameters +> List.iter (fun (arg, param_opt) -> 796 match arg, param_opt with 797 (* keyword argument; do not consider this variable as unused. 798 * We consider this variable as a pure comment here and just pass over. 799 * todo: could make sure they are not defined in the current 800 * environment in strict mode? and if they are, shout because of 801 * bad practice? 802 *) 803 | Assign (None, Var _name, e2), _ -> 804 expr env e2 805 (* a variable passed by reference, this can considered a new decl *) 806 | Var name, Some {Ast_php.p_ref = Some _;_} -> 807 808 (* if was already assigned and passed by refs, 809 * increment its use counter then. 810 * less: or should we increase only if inout param? 811 *) 812 create_new_local_if_necessary ~incr_count:true env name 813 814 | _ -> expr env arg 815 ) 816 817 818 | This name -> 819 (* when we do use($this) in closures, we create a fresh $this variable 820 * with a refcount of 0, so we need to increment it here. 821 *) 822 check_defined ~incr_count:true env name 823 824 (* array used as an rvalue; the lvalue case should be handled in Assign. *) 825 | Array_get (e, eopt) -> 826 expr env e; 827 Common.opt (expr env) eopt 828 829 | Obj_get (e1, e2) -> 830 expr env e1; 831 (match e2 with 832 (* with 'echo $o->$v' we have a dynamic field, we need to visit 833 * e2 to mark $v as used at least. 834 *) 835 | Id _name -> () 836 | _ -> expr env e2 837 ) 838 839 | Class_get (e1, e2) -> 840 expr env e1; 841 (match e2 with 842 (* with 'echo A::$v' we should not issue a UseOfUndefinedVariable, 843 * check_classes_php.ml will handle this case. 844 *) 845 | Var _ -> () 846 | _ -> expr env e2 847 ) 848 849 | New (e, es) -> 850 expr env (Call (Class_get(e, Id[ ("__construct", None)]), es)) 851 852 | InstanceOf (e1, e2) -> exprl env [e1;e2] 853 854 | Infix (_, e) | Postfix (_, e) | Unop (_, e) -> expr env e 855 | Binop (_, e1, e2) -> exprl env [e1; e2] 856 | Guil xs -> exprl env xs 857 858 | Ref e | Unpack e -> expr env e 859 860 | ConsArray (xs) -> array_valuel env xs 861 | Collection (_n, xs) -> 862 array_valuel env xs 863 | Xhp x -> xml env x 864 865 | CondExpr (e1, e2, e3) -> exprl env [e1; e2; e3] 866 | Cast (_, e) -> expr env e 867 868 | Lambda def -> 869 let in_long_lambda = 870 match def.f_kind with ShortLambda -> false | _ -> true 871 in 872 func_def { env with in_long_lambda } def 873 874and array_value env x = 875 match x with 876 | Arrow (e1, e2) -> exprl env [e1; e2] 877 | e -> expr env e 878 879and xml env x = 880 x.xml_attrs +> List.iter (fun (_name, xhp_attr) -> expr env xhp_attr); 881 x.xml_body +> List.iter (xhp env) 882 883and xhp env = function 884 | XhpText _s -> () 885 | XhpExpr e -> expr env e 886 | XhpXml x -> xml env x 887 888and exprl env xs = List.iter (expr env) xs 889and array_valuel env xs = List.iter (array_value env) xs 890 891(* ---------------------------------------------------------------------- *) 892(* Misc *) 893(* ---------------------------------------------------------------------- *) 894 895(* checks for use of undefined members should be in check_classes, but 896 * it's hard to do locally. 897 * 898 * todo: inline the code of traits? 899 *) 900and class_def env def = 901 let env = { env with in_class = Some def.c_name } in 902 List.iter (constant_def env) def.c_constants; 903 List.iter (class_var env) def.c_variables; 904 List.iter (method_def env) def.c_methods 905 906(* cst_body should be a static scalar so there should not be any 907 * variable in it so in theory we don't need to check it ... doesn't 908 * hurt though, one day maybe this will change. 909 *) 910and constant_def env def = 911 Common.opt (expr env) def.cst_body 912 913(* type definitions do not contain any variables *) 914and typedef_def _env _def = 915 () 916 917and class_var env v = 918 Common.opt (expr env) v.cv_value 919 920and method_def env x = func_def env x 921 922(*****************************************************************************) 923(* Main entry point *) 924(*****************************************************************************) 925 926let check_and_annotate_program2 find_entity prog = 927 let env = { 928 (* less: should be a in globals field instead? *) 929 vars = ref (Env_php.globals_builtins +> List.map (fun s -> 930 "$" ^ s, (Ast_php.fakeInfo s, S.Global, ref 1) 931 ) +> Map_poly.of_list); 932 db = find_entity; 933 in_class = None; 934 in_long_lambda = false; 935 bailout = ref false; 936 scope_vars_used = Hashtbl.create 101; 937 } 938 in 939 let ast = Ast_php_simple_build.program_with_position_information prog in 940 program env ast; 941 942 (* annotating the scope of Var *) 943 (Ast_php.Program prog) +> 944 Visitor_php.mk_visitor { Visitor_php.default_visitor with 945 Visitor_php.kexpr = (fun (k, _) x -> 946 match x with 947 | Ast_php.IdVar (dname, aref) -> 948 let tok = Ast_php.info_of_dname dname in 949 (try 950 aref := Hashtbl.find env.scope_vars_used tok 951 (* keep NoScope *) 952 with Not_found -> () 953 ) 954 | _ -> k x 955 ); 956 }; 957 () 958 959let check_and_annotate_program a b = 960 Common.profile_code "Checker.variables" (fun () -> 961 check_and_annotate_program2 a b)