Should you guard against unexpected values from external APIs?












28















Lets say you are coding a function that takes input from an external API MyAPI.



That external API MyAPI has a contract that states it will return a string or a number.



Is it recommended to guard against things like null, undefined, boolean, etc. even though it's not part of the API of MyAPI? In particular, since you have no control over that API you cannot make the guarantee through something like static type analysis so it's better to be safe than sorry?



I'm thinking in relation to the Robustness Principle.










share|improve this question




















  • 7





    What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?

    – Vincent Savard
    yesterday








  • 34





    If you're expecting them, then by definition they're not unexpected.

    – Mason Wheeler
    yesterday






  • 13





    Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like <!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>

    – immibis
    yesterday






  • 4





    What does "external API" mean? Is it still under your Control?

    – Deduplicator
    yesterday






  • 3





    Absolutely. expect nothing, expect anything.

    – S..
    16 hours ago
















28















Lets say you are coding a function that takes input from an external API MyAPI.



That external API MyAPI has a contract that states it will return a string or a number.



Is it recommended to guard against things like null, undefined, boolean, etc. even though it's not part of the API of MyAPI? In particular, since you have no control over that API you cannot make the guarantee through something like static type analysis so it's better to be safe than sorry?



I'm thinking in relation to the Robustness Principle.










share|improve this question




















  • 7





    What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?

    – Vincent Savard
    yesterday








  • 34





    If you're expecting them, then by definition they're not unexpected.

    – Mason Wheeler
    yesterday






  • 13





    Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like <!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>

    – immibis
    yesterday






  • 4





    What does "external API" mean? Is it still under your Control?

    – Deduplicator
    yesterday






  • 3





    Absolutely. expect nothing, expect anything.

    – S..
    16 hours ago














28












28








28


2






Lets say you are coding a function that takes input from an external API MyAPI.



That external API MyAPI has a contract that states it will return a string or a number.



Is it recommended to guard against things like null, undefined, boolean, etc. even though it's not part of the API of MyAPI? In particular, since you have no control over that API you cannot make the guarantee through something like static type analysis so it's better to be safe than sorry?



I'm thinking in relation to the Robustness Principle.










share|improve this question
















Lets say you are coding a function that takes input from an external API MyAPI.



That external API MyAPI has a contract that states it will return a string or a number.



Is it recommended to guard against things like null, undefined, boolean, etc. even though it's not part of the API of MyAPI? In particular, since you have no control over that API you cannot make the guarantee through something like static type analysis so it's better to be safe than sorry?



I'm thinking in relation to the Robustness Principle.







design api api-design web-services functions






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 57 mins ago









jpmc26

3,84921735




3,84921735










asked yesterday









Adam ThompsonAdam Thompson

35239




35239








  • 7





    What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?

    – Vincent Savard
    yesterday








  • 34





    If you're expecting them, then by definition they're not unexpected.

    – Mason Wheeler
    yesterday






  • 13





    Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like <!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>

    – immibis
    yesterday






  • 4





    What does "external API" mean? Is it still under your Control?

    – Deduplicator
    yesterday






  • 3





    Absolutely. expect nothing, expect anything.

    – S..
    16 hours ago














  • 7





    What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?

    – Vincent Savard
    yesterday








  • 34





    If you're expecting them, then by definition they're not unexpected.

    – Mason Wheeler
    yesterday






  • 13





    Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like <!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>

    – immibis
    yesterday






  • 4





    What does "external API" mean? Is it still under your Control?

    – Deduplicator
    yesterday






  • 3





    Absolutely. expect nothing, expect anything.

    – S..
    16 hours ago








7




7





What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?

– Vincent Savard
yesterday







What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?

– Vincent Savard
yesterday






34




34





If you're expecting them, then by definition they're not unexpected.

– Mason Wheeler
yesterday





If you're expecting them, then by definition they're not unexpected.

– Mason Wheeler
yesterday




13




13





Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like <!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>

– immibis
yesterday





Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like <!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>

– immibis
yesterday




4




4





What does "external API" mean? Is it still under your Control?

– Deduplicator
yesterday





What does "external API" mean? Is it still under your Control?

– Deduplicator
yesterday




3




3





Absolutely. expect nothing, expect anything.

– S..
16 hours ago





Absolutely. expect nothing, expect anything.

– S..
16 hours ago










8 Answers
8






active

oldest

votes


















64














You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP



Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.





From the comments I can see that perhaps my answer could use a bit of expansion.



By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.



One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".



Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.






share|improve this answer





















  • 14





    What is q.v. stand for ?

    – JonH
    yesterday






  • 10





    @JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.

    – andrewtweber
    yesterday






  • 5





    This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.

    – leftaroundabout
    yesterday








  • 12





    @leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.

    – Paul
    yesterday






  • 6





    @leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.

    – Voo
    19 hours ago



















23














Yes, of course. But what makes you think the answer could be different?



You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.



However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.



But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.



So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.






share|improve this answer


























  • What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.

    – mckenzm
    yesterday













  • @mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?

    – Doc Brown
    22 hours ago













  • hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.

    – mckenzm
    6 hours ago



















10














The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.




EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:




3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.




This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.



See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.




Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.



This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.



This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.






share|improve this answer





















  • 5





    While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.

    – Morgen
    yesterday






  • 6





    @Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.

    – supercat
    yesterday






  • 7





    @supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)

    – Doc Brown
    yesterday








  • 7





    @DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.

    – supercat
    yesterday






  • 5





    @supercat Exactly. JavaScript immediately comes to mind, for example...

    – Mason Wheeler
    yesterday



















9














In general, code should be constructed to uphold the at least the following constraints whenever practical:




  1. When given correct input, produce correct output.


  2. When given valid input (that may or may not be correct), produce valid output (likewise).


  3. When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.



In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.



While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.



Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.






share|improve this answer































    2














    Let's compare the two scenarios and try to come to a conclusion.



    Scenario 1
    Our application assumes the external API will behave as per the agreement.



    Scenario 2
    Our application assumes the external API can misbehave, hence add precautions.



    In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.



    If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.



    For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.



    So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.






    share|improve this answer








    New contributor




    lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.




























      1














      You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.



      Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.






      share|improve this answer































        1














        My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.



        The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?



        The two exceptions to testing everything in my world are:





        1. Performance after careful measurement of performance:




          • never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.




        2. When you have no clue what to do with an error




          • there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.




        Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:




        • the data type actully is a string


        • and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).


        • Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.



        The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.






        share|improve this answer































          -1














          To give a slightly differing opinion:
          I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
          I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.



          There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".






          share|improve this answer
























          • "I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.

            – Voo
            9 hours ago











          • @Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)

            – Stobor
            2 hours ago













          • @ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.

            – Stobor
            2 hours ago










          protected by gnat 9 hours ago



          Thank you for your interest in this question.
          Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).



          Would you like to answer one of these unanswered questions instead?














          8 Answers
          8






          active

          oldest

          votes








          8 Answers
          8






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          64














          You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP



          Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.





          From the comments I can see that perhaps my answer could use a bit of expansion.



          By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.



          One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".



          Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.






          share|improve this answer





















          • 14





            What is q.v. stand for ?

            – JonH
            yesterday






          • 10





            @JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.

            – andrewtweber
            yesterday






          • 5





            This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.

            – leftaroundabout
            yesterday








          • 12





            @leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.

            – Paul
            yesterday






          • 6





            @leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.

            – Voo
            19 hours ago
















          64














          You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP



          Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.





          From the comments I can see that perhaps my answer could use a bit of expansion.



          By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.



          One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".



          Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.






          share|improve this answer





















          • 14





            What is q.v. stand for ?

            – JonH
            yesterday






          • 10





            @JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.

            – andrewtweber
            yesterday






          • 5





            This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.

            – leftaroundabout
            yesterday








          • 12





            @leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.

            – Paul
            yesterday






          • 6





            @leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.

            – Voo
            19 hours ago














          64












          64








          64







          You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP



          Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.





          From the comments I can see that perhaps my answer could use a bit of expansion.



          By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.



          One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".



          Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.






          share|improve this answer















          You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP



          Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.





          From the comments I can see that perhaps my answer could use a bit of expansion.



          By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.



          One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".



          Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 7 hours ago

























          answered yesterday









          PaulPaul

          2,5751015




          2,5751015








          • 14





            What is q.v. stand for ?

            – JonH
            yesterday






          • 10





            @JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.

            – andrewtweber
            yesterday






          • 5





            This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.

            – leftaroundabout
            yesterday








          • 12





            @leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.

            – Paul
            yesterday






          • 6





            @leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.

            – Voo
            19 hours ago














          • 14





            What is q.v. stand for ?

            – JonH
            yesterday






          • 10





            @JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.

            – andrewtweber
            yesterday






          • 5





            This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.

            – leftaroundabout
            yesterday








          • 12





            @leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.

            – Paul
            yesterday






          • 6





            @leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.

            – Voo
            19 hours ago








          14




          14





          What is q.v. stand for ?

          – JonH
          yesterday





          What is q.v. stand for ?

          – JonH
          yesterday




          10




          10





          @JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.

          – andrewtweber
          yesterday





          @JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.

          – andrewtweber
          yesterday




          5




          5





          This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.

          – leftaroundabout
          yesterday







          This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.

          – leftaroundabout
          yesterday






          12




          12





          @leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.

          – Paul
          yesterday





          @leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.

          – Paul
          yesterday




          6




          6





          @leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.

          – Voo
          19 hours ago





          @leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.

          – Voo
          19 hours ago













          23














          Yes, of course. But what makes you think the answer could be different?



          You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.



          However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.



          But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.



          So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.






          share|improve this answer


























          • What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.

            – mckenzm
            yesterday













          • @mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?

            – Doc Brown
            22 hours ago













          • hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.

            – mckenzm
            6 hours ago
















          23














          Yes, of course. But what makes you think the answer could be different?



          You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.



          However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.



          But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.



          So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.






          share|improve this answer


























          • What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.

            – mckenzm
            yesterday













          • @mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?

            – Doc Brown
            22 hours ago













          • hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.

            – mckenzm
            6 hours ago














          23












          23








          23







          Yes, of course. But what makes you think the answer could be different?



          You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.



          However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.



          But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.



          So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.






          share|improve this answer















          Yes, of course. But what makes you think the answer could be different?



          You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.



          However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.



          But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.



          So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited yesterday

























          answered yesterday









          Doc BrownDoc Brown

          131k23240380




          131k23240380













          • What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.

            – mckenzm
            yesterday













          • @mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?

            – Doc Brown
            22 hours ago













          • hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.

            – mckenzm
            6 hours ago



















          • What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.

            – mckenzm
            yesterday













          • @mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?

            – Doc Brown
            22 hours ago













          • hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.

            – mckenzm
            6 hours ago

















          What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.

          – mckenzm
          yesterday







          What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.

          – mckenzm
          yesterday















          @mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?

          – Doc Brown
          22 hours ago







          @mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?

          – Doc Brown
          22 hours ago















          hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.

          – mckenzm
          6 hours ago





          hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.

          – mckenzm
          6 hours ago











          10














          The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.




          EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:




          3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.




          This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.



          See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.




          Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.



          This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.



          This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.






          share|improve this answer





















          • 5





            While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.

            – Morgen
            yesterday






          • 6





            @Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.

            – supercat
            yesterday






          • 7





            @supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)

            – Doc Brown
            yesterday








          • 7





            @DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.

            – supercat
            yesterday






          • 5





            @supercat Exactly. JavaScript immediately comes to mind, for example...

            – Mason Wheeler
            yesterday
















          10














          The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.




          EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:




          3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.




          This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.



          See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.




          Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.



          This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.



          This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.






          share|improve this answer





















          • 5





            While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.

            – Morgen
            yesterday






          • 6





            @Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.

            – supercat
            yesterday






          • 7





            @supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)

            – Doc Brown
            yesterday








          • 7





            @DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.

            – supercat
            yesterday






          • 5





            @supercat Exactly. JavaScript immediately comes to mind, for example...

            – Mason Wheeler
            yesterday














          10












          10








          10







          The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.




          EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:




          3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.




          This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.



          See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.




          Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.



          This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.



          This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.






          share|improve this answer















          The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.




          EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:




          3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.




          This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.



          See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.




          Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.



          This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.



          This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 8 hours ago

























          answered yesterday









          Mason WheelerMason Wheeler

          74.4k18213298




          74.4k18213298








          • 5





            While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.

            – Morgen
            yesterday






          • 6





            @Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.

            – supercat
            yesterday






          • 7





            @supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)

            – Doc Brown
            yesterday








          • 7





            @DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.

            – supercat
            yesterday






          • 5





            @supercat Exactly. JavaScript immediately comes to mind, for example...

            – Mason Wheeler
            yesterday














          • 5





            While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.

            – Morgen
            yesterday






          • 6





            @Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.

            – supercat
            yesterday






          • 7





            @supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)

            – Doc Brown
            yesterday








          • 7





            @DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.

            – supercat
            yesterday






          • 5





            @supercat Exactly. JavaScript immediately comes to mind, for example...

            – Mason Wheeler
            yesterday








          5




          5





          While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.

          – Morgen
          yesterday





          While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.

          – Morgen
          yesterday




          6




          6





          @Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.

          – supercat
          yesterday





          @Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.

          – supercat
          yesterday




          7




          7





          @supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)

          – Doc Brown
          yesterday







          @supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)

          – Doc Brown
          yesterday






          7




          7





          @DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.

          – supercat
          yesterday





          @DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.

          – supercat
          yesterday




          5




          5





          @supercat Exactly. JavaScript immediately comes to mind, for example...

          – Mason Wheeler
          yesterday





          @supercat Exactly. JavaScript immediately comes to mind, for example...

          – Mason Wheeler
          yesterday











          9














          In general, code should be constructed to uphold the at least the following constraints whenever practical:




          1. When given correct input, produce correct output.


          2. When given valid input (that may or may not be correct), produce valid output (likewise).


          3. When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.



          In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.



          While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.



          Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.






          share|improve this answer




























            9














            In general, code should be constructed to uphold the at least the following constraints whenever practical:




            1. When given correct input, produce correct output.


            2. When given valid input (that may or may not be correct), produce valid output (likewise).


            3. When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.



            In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.



            While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.



            Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.






            share|improve this answer


























              9












              9








              9







              In general, code should be constructed to uphold the at least the following constraints whenever practical:




              1. When given correct input, produce correct output.


              2. When given valid input (that may or may not be correct), produce valid output (likewise).


              3. When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.



              In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.



              While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.



              Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.






              share|improve this answer













              In general, code should be constructed to uphold the at least the following constraints whenever practical:




              1. When given correct input, produce correct output.


              2. When given valid input (that may or may not be correct), produce valid output (likewise).


              3. When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.



              In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.



              While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.



              Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.







              share|improve this answer












              share|improve this answer



              share|improve this answer










              answered yesterday









              supercatsupercat

              6,9301725




              6,9301725























                  2














                  Let's compare the two scenarios and try to come to a conclusion.



                  Scenario 1
                  Our application assumes the external API will behave as per the agreement.



                  Scenario 2
                  Our application assumes the external API can misbehave, hence add precautions.



                  In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.



                  If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.



                  For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.



                  So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.






                  share|improve this answer








                  New contributor




                  lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                  Check out our Code of Conduct.

























                    2














                    Let's compare the two scenarios and try to come to a conclusion.



                    Scenario 1
                    Our application assumes the external API will behave as per the agreement.



                    Scenario 2
                    Our application assumes the external API can misbehave, hence add precautions.



                    In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.



                    If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.



                    For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.



                    So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.






                    share|improve this answer








                    New contributor




                    lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                    Check out our Code of Conduct.























                      2












                      2








                      2







                      Let's compare the two scenarios and try to come to a conclusion.



                      Scenario 1
                      Our application assumes the external API will behave as per the agreement.



                      Scenario 2
                      Our application assumes the external API can misbehave, hence add precautions.



                      In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.



                      If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.



                      For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.



                      So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.






                      share|improve this answer








                      New contributor




                      lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.










                      Let's compare the two scenarios and try to come to a conclusion.



                      Scenario 1
                      Our application assumes the external API will behave as per the agreement.



                      Scenario 2
                      Our application assumes the external API can misbehave, hence add precautions.



                      In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.



                      If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.



                      For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.



                      So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.







                      share|improve this answer








                      New contributor




                      lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.









                      share|improve this answer



                      share|improve this answer






                      New contributor




                      lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.









                      answered yesterday









                      lkamallkamal

                      1413




                      1413




                      New contributor




                      lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.





                      New contributor





                      lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.






                      lkamal is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                      Check out our Code of Conduct.























                          1














                          You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.



                          Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.






                          share|improve this answer




























                            1














                            You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.



                            Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.






                            share|improve this answer


























                              1












                              1








                              1







                              You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.



                              Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.






                              share|improve this answer













                              You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.



                              Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.







                              share|improve this answer












                              share|improve this answer



                              share|improve this answer










                              answered yesterday









                              StarTrekRedneckStarTrekRedneck

                              1783




                              1783























                                  1














                                  My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.



                                  The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?



                                  The two exceptions to testing everything in my world are:





                                  1. Performance after careful measurement of performance:




                                    • never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.




                                  2. When you have no clue what to do with an error




                                    • there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.




                                  Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:




                                  • the data type actully is a string


                                  • and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).


                                  • Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.



                                  The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.






                                  share|improve this answer




























                                    1














                                    My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.



                                    The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?



                                    The two exceptions to testing everything in my world are:





                                    1. Performance after careful measurement of performance:




                                      • never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.




                                    2. When you have no clue what to do with an error




                                      • there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.




                                    Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:




                                    • the data type actully is a string


                                    • and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).


                                    • Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.



                                    The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.






                                    share|improve this answer


























                                      1












                                      1








                                      1







                                      My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.



                                      The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?



                                      The two exceptions to testing everything in my world are:





                                      1. Performance after careful measurement of performance:




                                        • never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.




                                      2. When you have no clue what to do with an error




                                        • there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.




                                      Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:




                                      • the data type actully is a string


                                      • and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).


                                      • Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.



                                      The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.






                                      share|improve this answer













                                      My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.



                                      The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?



                                      The two exceptions to testing everything in my world are:





                                      1. Performance after careful measurement of performance:




                                        • never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.




                                      2. When you have no clue what to do with an error




                                        • there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.




                                      Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:




                                      • the data type actully is a string


                                      • and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).


                                      • Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.



                                      The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.







                                      share|improve this answer












                                      share|improve this answer



                                      share|improve this answer










                                      answered 10 hours ago









                                      ghellquistghellquist

                                      1214




                                      1214























                                          -1














                                          To give a slightly differing opinion:
                                          I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
                                          I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.



                                          There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".






                                          share|improve this answer
























                                          • "I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.

                                            – Voo
                                            9 hours ago











                                          • @Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)

                                            – Stobor
                                            2 hours ago













                                          • @ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.

                                            – Stobor
                                            2 hours ago
















                                          -1














                                          To give a slightly differing opinion:
                                          I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
                                          I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.



                                          There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".






                                          share|improve this answer
























                                          • "I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.

                                            – Voo
                                            9 hours ago











                                          • @Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)

                                            – Stobor
                                            2 hours ago













                                          • @ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.

                                            – Stobor
                                            2 hours ago














                                          -1












                                          -1








                                          -1







                                          To give a slightly differing opinion:
                                          I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
                                          I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.



                                          There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".






                                          share|improve this answer













                                          To give a slightly differing opinion:
                                          I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
                                          I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.



                                          There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".







                                          share|improve this answer












                                          share|improve this answer



                                          share|improve this answer










                                          answered 16 hours ago









                                          Christian SauerChristian Sauer

                                          749415




                                          749415













                                          • "I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.

                                            – Voo
                                            9 hours ago











                                          • @Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)

                                            – Stobor
                                            2 hours ago













                                          • @ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.

                                            – Stobor
                                            2 hours ago



















                                          • "I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.

                                            – Voo
                                            9 hours ago











                                          • @Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)

                                            – Stobor
                                            2 hours ago













                                          • @ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.

                                            – Stobor
                                            2 hours ago

















                                          "I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.

                                          – Voo
                                          9 hours ago





                                          "I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.

                                          – Voo
                                          9 hours ago













                                          @Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)

                                          – Stobor
                                          2 hours ago







                                          @Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)

                                          – Stobor
                                          2 hours ago















                                          @ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.

                                          – Stobor
                                          2 hours ago





                                          @ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.

                                          – Stobor
                                          2 hours ago





                                          protected by gnat 9 hours ago



                                          Thank you for your interest in this question.
                                          Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).



                                          Would you like to answer one of these unanswered questions instead?



                                          Popular posts from this blog

                                          Statuo de Libereco

                                          Tanganjiko

                                          Liste der Baudenkmäler in Enneberg