Should you guard against unexpected values from external APIs?
Lets say you are coding a function that takes input from an external API MyAPI
.
That external API MyAPI
has a contract that states it will return a string
or a number
.
Is it recommended to guard against things like null
, undefined
, boolean
, etc. even though it's not part of the API of MyAPI
? In particular, since you have no control over that API you cannot make the guarantee through something like static type analysis so it's better to be safe than sorry?
I'm thinking in relation to the Robustness Principle.
design api api-design web-services functions
|
show 7 more comments
Lets say you are coding a function that takes input from an external API MyAPI
.
That external API MyAPI
has a contract that states it will return a string
or a number
.
Is it recommended to guard against things like null
, undefined
, boolean
, etc. even though it's not part of the API of MyAPI
? In particular, since you have no control over that API you cannot make the guarantee through something like static type analysis so it's better to be safe than sorry?
I'm thinking in relation to the Robustness Principle.
design api api-design web-services functions
7
What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?
– Vincent Savard
yesterday
34
If you're expecting them, then by definition they're not unexpected.
– Mason Wheeler
yesterday
13
Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like<!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>
– immibis
yesterday
4
What does "external API" mean? Is it still under your Control?
– Deduplicator
yesterday
3
Absolutely. expect nothing, expect anything.
– S..
16 hours ago
|
show 7 more comments
Lets say you are coding a function that takes input from an external API MyAPI
.
That external API MyAPI
has a contract that states it will return a string
or a number
.
Is it recommended to guard against things like null
, undefined
, boolean
, etc. even though it's not part of the API of MyAPI
? In particular, since you have no control over that API you cannot make the guarantee through something like static type analysis so it's better to be safe than sorry?
I'm thinking in relation to the Robustness Principle.
design api api-design web-services functions
Lets say you are coding a function that takes input from an external API MyAPI
.
That external API MyAPI
has a contract that states it will return a string
or a number
.
Is it recommended to guard against things like null
, undefined
, boolean
, etc. even though it's not part of the API of MyAPI
? In particular, since you have no control over that API you cannot make the guarantee through something like static type analysis so it's better to be safe than sorry?
I'm thinking in relation to the Robustness Principle.
design api api-design web-services functions
design api api-design web-services functions
edited 57 mins ago
jpmc26
3,84921735
3,84921735
asked yesterday
Adam ThompsonAdam Thompson
35239
35239
7
What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?
– Vincent Savard
yesterday
34
If you're expecting them, then by definition they're not unexpected.
– Mason Wheeler
yesterday
13
Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like<!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>
– immibis
yesterday
4
What does "external API" mean? Is it still under your Control?
– Deduplicator
yesterday
3
Absolutely. expect nothing, expect anything.
– S..
16 hours ago
|
show 7 more comments
7
What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?
– Vincent Savard
yesterday
34
If you're expecting them, then by definition they're not unexpected.
– Mason Wheeler
yesterday
13
Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like<!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>
– immibis
yesterday
4
What does "external API" mean? Is it still under your Control?
– Deduplicator
yesterday
3
Absolutely. expect nothing, expect anything.
– S..
16 hours ago
7
7
What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?
– Vincent Savard
yesterday
What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?
– Vincent Savard
yesterday
34
34
If you're expecting them, then by definition they're not unexpected.
– Mason Wheeler
yesterday
If you're expecting them, then by definition they're not unexpected.
– Mason Wheeler
yesterday
13
13
Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like
<!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>
– immibis
yesterday
Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like
<!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>
– immibis
yesterday
4
4
What does "external API" mean? Is it still under your Control?
– Deduplicator
yesterday
What does "external API" mean? Is it still under your Control?
– Deduplicator
yesterday
3
3
Absolutely. expect nothing, expect anything.
– S..
16 hours ago
Absolutely. expect nothing, expect anything.
– S..
16 hours ago
|
show 7 more comments
8 Answers
8
active
oldest
votes
You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP
Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.
From the comments I can see that perhaps my answer could use a bit of expansion.
By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.
One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".
Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.
14
What is q.v. stand for ?
– JonH
yesterday
10
@JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.
– andrewtweber
yesterday
5
This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.
– leftaroundabout
yesterday
12
@leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.
– Paul
yesterday
6
@leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.
– Voo
19 hours ago
|
show 6 more comments
Yes, of course. But what makes you think the answer could be different?
You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.
However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.
But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.
So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.
What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.
– mckenzm
yesterday
@mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?
– Doc Brown
22 hours ago
hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.
– mckenzm
6 hours ago
add a comment |
The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.
EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:
3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.
This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.
See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.
Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.
This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.
This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.
5
While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.
– Morgen
yesterday
6
@Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.
– supercat
yesterday
7
@supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)
– Doc Brown
yesterday
7
@DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.
– supercat
yesterday
5
@supercat Exactly. JavaScript immediately comes to mind, for example...
– Mason Wheeler
yesterday
|
show 3 more comments
In general, code should be constructed to uphold the at least the following constraints whenever practical:
When given correct input, produce correct output.
When given valid input (that may or may not be correct), produce valid output (likewise).
When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.
In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.
While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.
Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.
add a comment |
Let's compare the two scenarios and try to come to a conclusion.
Scenario 1
Our application assumes the external API will behave as per the agreement.
Scenario 2
Our application assumes the external API can misbehave, hence add precautions.
In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.
If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.
For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.
So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.
New contributor
add a comment |
You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.
Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.
add a comment |
My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.
The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?
The two exceptions to testing everything in my world are:
Performance after careful measurement of performance:
- never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.
When you have no clue what to do with an error
- there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.
Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:
the data type actully is a string
and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).
Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.
The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.
add a comment |
To give a slightly differing opinion:
I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.
There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".
"I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.
– Voo
9 hours ago
@Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)
– Stobor
2 hours ago
@ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.
– Stobor
2 hours ago
add a comment |
protected by gnat 9 hours ago
Thank you for your interest in this question.
Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).
Would you like to answer one of these unanswered questions instead?
8 Answers
8
active
oldest
votes
8 Answers
8
active
oldest
votes
active
oldest
votes
active
oldest
votes
You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP
Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.
From the comments I can see that perhaps my answer could use a bit of expansion.
By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.
One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".
Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.
14
What is q.v. stand for ?
– JonH
yesterday
10
@JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.
– andrewtweber
yesterday
5
This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.
– leftaroundabout
yesterday
12
@leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.
– Paul
yesterday
6
@leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.
– Voo
19 hours ago
|
show 6 more comments
You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP
Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.
From the comments I can see that perhaps my answer could use a bit of expansion.
By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.
One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".
Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.
14
What is q.v. stand for ?
– JonH
yesterday
10
@JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.
– andrewtweber
yesterday
5
This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.
– leftaroundabout
yesterday
12
@leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.
– Paul
yesterday
6
@leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.
– Voo
19 hours ago
|
show 6 more comments
You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP
Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.
From the comments I can see that perhaps my answer could use a bit of expansion.
By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.
One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".
Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.
You should never trust the inputs to your software, regardless of source. Not only validating the types is important, but also ranges of input and the business logic as well. Per a comment, this is well described by OWASP
Failing to do so will at best leave you with garbage data that you have to later clean up, but at worst you'll leave an opportunity for malicious exploits if that upstream service gets compromised in some fashion (q.v. the Target hack). The range of problems in between includes getting your application in an unrecoverable state.
From the comments I can see that perhaps my answer could use a bit of expansion.
By "never trust the inputs", I simply mean that you can't assume that you'll always receive valid and trustworthy information from upstream or downstream systems, and therefore you should always sanitize that input to the best of your ability, or reject it.
One argument surfaced in the comments I'll address by way of example. While yes, you have to trust your OS to some degree, it's not unreasonable to, for example, reject the results of a random number generator if you ask it for a number between 1 and 10 and it responds with "bob".
Similarly, in the case of the OP, you should definitely ensure your application is only accepting valid input from the upstream service. What you do when it's not OK is up to you, and depends a great deal on the actual business function that you're trying to accomplish, but minimally you'd log it for later debugging and otherwise ensure that your application doesn't go into an unrecoverable or insecure state. While you can never know every possible input someone/something might give you, you certainly can limit what's allowable based on the business requirements and do some form of input whitelisting based on that.
edited 7 hours ago
answered yesterday
PaulPaul
2,5751015
2,5751015
14
What is q.v. stand for ?
– JonH
yesterday
10
@JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.
– andrewtweber
yesterday
5
This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.
– leftaroundabout
yesterday
12
@leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.
– Paul
yesterday
6
@leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.
– Voo
19 hours ago
|
show 6 more comments
14
What is q.v. stand for ?
– JonH
yesterday
10
@JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.
– andrewtweber
yesterday
5
This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.
– leftaroundabout
yesterday
12
@leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.
– Paul
yesterday
6
@leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.
– Voo
19 hours ago
14
14
What is q.v. stand for ?
– JonH
yesterday
What is q.v. stand for ?
– JonH
yesterday
10
10
@JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.
– andrewtweber
yesterday
@JonH basically "see also"... the Target hack is an example that he is referencing en.oxforddictionaries.com/definition/q.v.
– andrewtweber
yesterday
5
5
This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.
– leftaroundabout
yesterday
This answer is as it stands just doesn't make sense. It's infeasible to anticipate each and every way a third-party library might misbehave. If a library function's documentation explicitly assures that the result will always have some properties, then you should be able to rely on it that the designers ensured this property will actually hold. It's their responsibility to have a test suite that checks this kind of thing, and submit a bug fix in case a situation is encountered where it doesn't. You checking these properties in your own code is violating the DRY principle.
– leftaroundabout
yesterday
12
12
@leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.
– Paul
yesterday
@leftaroundabout no, but you should be able to predict all valid things your application can accept and reject the rest.
– Paul
yesterday
6
6
@leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.
– Voo
19 hours ago
@leftaroundabout It's not about distrusting everything, it's about distrusting external untrusted sources. This is all about threat modelling. If you haven't done that your software isn't secure (how can it be, if you never even thought about against what kind of actors and threats you want to secure your application?). For a run of the mill business software it's a reasonable default to assume that callers could be malicious, while it's rarely sensible to assume your OS is a threat.
– Voo
19 hours ago
|
show 6 more comments
Yes, of course. But what makes you think the answer could be different?
You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.
However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.
But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.
So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.
What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.
– mckenzm
yesterday
@mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?
– Doc Brown
22 hours ago
hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.
– mckenzm
6 hours ago
add a comment |
Yes, of course. But what makes you think the answer could be different?
You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.
However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.
But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.
So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.
What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.
– mckenzm
yesterday
@mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?
– Doc Brown
22 hours ago
hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.
– mckenzm
6 hours ago
add a comment |
Yes, of course. But what makes you think the answer could be different?
You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.
However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.
But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.
So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.
Yes, of course. But what makes you think the answer could be different?
You surely don't want to let your program behave in some unpredictable manner in case the API does not return what the contract says, don't you? So at least you have to deal with such a behaviour somehow. A minimal form of error handling is always worth the (very minimal!) effort, and there is absolutely no excuse for not implementing something like this.
However, how much effort you should invest to deal with such a case is heavily case dependent and can only be answered in context of your system. Often, a short log entry and letting the application end gracefully can be enough. Sometimes, you will be better off to implement some detailed exception handling, dealing with different forms of "wrong" return values, and maybe have to implement some fallback strategy.
But it makes a hell of a difference if you are writing just some inhouse spreadsheet formatting application, to be used by less than 10 people and where the financial impact of an application crash is quite low, or if you are creating a new autonomous car driving system, where an application crash may cost lives.
So there is no shortcut against reflecting about what you are doing, using your common sense is always mandatory.
edited yesterday
answered yesterday
Doc BrownDoc Brown
131k23240380
131k23240380
What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.
– mckenzm
yesterday
@mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?
– Doc Brown
22 hours ago
hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.
– mckenzm
6 hours ago
add a comment |
What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.
– mckenzm
yesterday
@mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?
– Doc Brown
22 hours ago
hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.
– mckenzm
6 hours ago
What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.
– mckenzm
yesterday
What to do is another decision. You may have a fail over solution. Anything asynchronous could be retried before creating an exception log (or dead letter). An active alert to the vendor or provider may be an option if the issue persists.
– mckenzm
yesterday
@mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?
– Doc Brown
22 hours ago
@mckenzm: the fact the OP asks a question where the literal answer can obviously be only "yes" is IMHO a sign they may not just be interested in a literal answer. It looks they are asking "is it necessary to guard against different forms of unexpected values from an API and deal with them differently"?
– Doc Brown
22 hours ago
hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.
– mckenzm
6 hours ago
hmm, the crap/carp/die approach. Is it our fault for passing bad (but legal) requests? is the response possible, but not usable for us in particular? or is the response corrupt? Different scenarios, Now it does sound like homework.
– mckenzm
6 hours ago
add a comment |
The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.
EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:
3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.
This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.
See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.
Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.
This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.
This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.
5
While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.
– Morgen
yesterday
6
@Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.
– supercat
yesterday
7
@supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)
– Doc Brown
yesterday
7
@DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.
– supercat
yesterday
5
@supercat Exactly. JavaScript immediately comes to mind, for example...
– Mason Wheeler
yesterday
|
show 3 more comments
The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.
EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:
3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.
This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.
See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.
Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.
This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.
This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.
5
While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.
– Morgen
yesterday
6
@Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.
– supercat
yesterday
7
@supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)
– Doc Brown
yesterday
7
@DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.
– supercat
yesterday
5
@supercat Exactly. JavaScript immediately comes to mind, for example...
– Mason Wheeler
yesterday
|
show 3 more comments
The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.
EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:
3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.
This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.
See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.
Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.
This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.
This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.
The Robustness Principle--specifically, the "be liberal in what you accept" half of it--is a very bad idea in software. It was originally developed in the context of hardware, where physical constraints make engineering tolerances very important, but in software, when someone sends you malformed or otherwise improper input, you have two choices. You can either reject it, (preferably with an explanation as to what went wrong,) or you can try to figure out what it was supposed to mean.
EDIT: Turns out I was mistaken in the above statement. The Robustness Principle doesn't come from the world of hardware, but from Internet architecture, specifically RFC 1958. It states:
3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.
This is, plainly speaking, simply wrong from start to finish. It is difficult to conceive of a more wrongheaded notion of error handling than "discard faulty input silently without returning an error message," for the reasons given in this post.
See also the IETF paper The Harmful Consequences of the Robustness Principle for further elaboration on this point.
Never, never, never choose that second option unless you have resources equivalent to Google's Search team to throw at your project, because that's what it takes to come up with a computer program that does anything close to a decent job at that particular problem domain. (And even then, Google's suggestions feel like they're coming straight out of left field about half the time.) If you try to do so, what you'll end up with is a massive headache where your program will frequently try to interpret bad input as X, when what the sender really meant was Y.
This is bad for two reasons. The obvious one is because then you have bad data in your system. The less obvious one is that in many cases, neither you nor the sender will realize that anything went wrong until much later down the road when something blows up in your face, and then suddenly you have a big, expensive mess to fix and no idea what went wrong because the noticeable effect is so far removed from the root cause.
This is why the Fail Fast principle exists; save everyone involved the headache by applying it to your APIs.
edited 8 hours ago
answered yesterday
Mason WheelerMason Wheeler
74.4k18213298
74.4k18213298
5
While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.
– Morgen
yesterday
6
@Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.
– supercat
yesterday
7
@supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)
– Doc Brown
yesterday
7
@DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.
– supercat
yesterday
5
@supercat Exactly. JavaScript immediately comes to mind, for example...
– Mason Wheeler
yesterday
|
show 3 more comments
5
While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.
– Morgen
yesterday
6
@Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.
– supercat
yesterday
7
@supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)
– Doc Brown
yesterday
7
@DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.
– supercat
yesterday
5
@supercat Exactly. JavaScript immediately comes to mind, for example...
– Mason Wheeler
yesterday
5
5
While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.
– Morgen
yesterday
While I agree with the principle of what you're saying, I think you're mistaken WRT the intent of the Robustness Principle. I've never seen it intended to mean, "accept bad data", only, "don't be excessively fiddly about good data". For example, if the input is a CSV file, the Robustness Principle wouldn't be a valid argument for trying to parse out dates in an unexpected format, but would support an argument that inferring colum order from a header row would be a good idea.
– Morgen
yesterday
6
6
@Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.
– supercat
yesterday
@Morgen: The robustness principle was used to suggest that browsers should accept rather sloppy HTML, and led to deployed web sites being much sloppier than they would have been if browsers had demanded proper HTML. A big part of the problem there, though, was the use of a common format for human-generated and machine-generated content, as opposed to the use of separate human-editable and machine-parsable formats along with utilities to convert between them.
– supercat
yesterday
7
7
@supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)
– Doc Brown
yesterday
@supercat: nevertheless - or just hence - HTML and the WWW was extremely successful ;-)
– Doc Brown
yesterday
7
7
@DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.
– supercat
yesterday
@DocBrown: A lot of really horrible things have become standards simply because they were the first approach that happened to be available when someone with a lot of clout needed to adopt something that met certain minimal criteria, and by the time they gained traction it was too late to select something better.
– supercat
yesterday
5
5
@supercat Exactly. JavaScript immediately comes to mind, for example...
– Mason Wheeler
yesterday
@supercat Exactly. JavaScript immediately comes to mind, for example...
– Mason Wheeler
yesterday
|
show 3 more comments
In general, code should be constructed to uphold the at least the following constraints whenever practical:
When given correct input, produce correct output.
When given valid input (that may or may not be correct), produce valid output (likewise).
When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.
In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.
While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.
Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.
add a comment |
In general, code should be constructed to uphold the at least the following constraints whenever practical:
When given correct input, produce correct output.
When given valid input (that may or may not be correct), produce valid output (likewise).
When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.
In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.
While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.
Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.
add a comment |
In general, code should be constructed to uphold the at least the following constraints whenever practical:
When given correct input, produce correct output.
When given valid input (that may or may not be correct), produce valid output (likewise).
When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.
In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.
While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.
Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.
In general, code should be constructed to uphold the at least the following constraints whenever practical:
When given correct input, produce correct output.
When given valid input (that may or may not be correct), produce valid output (likewise).
When given invalid input, process it without any side-effects beyond those caused by normal input or those which are defined as signalling an error.
In many situations, programs will essentially pass through various chunks of data without particularly caring about whether they are valid. If such chunks happen to contain invalid data, the program's output would likely contain invalid data as a consequence. Unless a program is specifically designed to validate all data, and guarantee that it will not produce invalid output even when given invalid input, programs that process its output should allow for the possibility of invalid data within it.
While validating data early on is often desirable, it's not always particularly practical. Among other things, if the validity of one chunk of data depends upon the contents of other chunks, and if the majority of of the data fed into some sequence of steps will get filtered out along the way, limiting validation to data which makes it through all stages may yield much better performance than trying to validate everything.
Further, even if a program is only expected to be given pre-validated data, it's often good to have it uphold the above constraints anyway whenever practical. Repeating full validation at every processing step would often be a major performance drain, but the limited amount of validation needed to uphold the above constraints may be much cheaper.
answered yesterday
supercatsupercat
6,9301725
6,9301725
add a comment |
add a comment |
Let's compare the two scenarios and try to come to a conclusion.
Scenario 1
Our application assumes the external API will behave as per the agreement.
Scenario 2
Our application assumes the external API can misbehave, hence add precautions.
In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.
If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.
For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.
So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.
New contributor
add a comment |
Let's compare the two scenarios and try to come to a conclusion.
Scenario 1
Our application assumes the external API will behave as per the agreement.
Scenario 2
Our application assumes the external API can misbehave, hence add precautions.
In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.
If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.
For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.
So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.
New contributor
add a comment |
Let's compare the two scenarios and try to come to a conclusion.
Scenario 1
Our application assumes the external API will behave as per the agreement.
Scenario 2
Our application assumes the external API can misbehave, hence add precautions.
In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.
If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.
For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.
So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.
New contributor
Let's compare the two scenarios and try to come to a conclusion.
Scenario 1
Our application assumes the external API will behave as per the agreement.
Scenario 2
Our application assumes the external API can misbehave, hence add precautions.
In general, there is a chance for any API or software to violate the agreements; may be due to a bug or unexpected conditions. Even an API might be having issues in the internal systems resulting in unexpected results.
If our program is written assuming the external API will adhere to the agreements and avoid adding any precautions; who will be the party facing the issues? It will be us, the ones who has written integration code.
For example, the null values that you have picked. Say, as per the API agreement the response should have not-null values; but if it is suddenly violated our program will result in NPEs.
So, I believe it will be better to make sure your application has some additional code to address unexpected scenarios.
New contributor
New contributor
answered yesterday
lkamallkamal
1413
1413
New contributor
New contributor
add a comment |
add a comment |
You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.
Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.
add a comment |
You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.
Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.
add a comment |
You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.
Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.
You should always validate incoming data -- user-entered or otherwise -- so you should have a process in place to handle when the data retrieved from this external API is invalid.
Generally speaking, any seam where extra-orgranizational systems meet should require authentication, authorization (if not defined simply by authentication), and validation.
answered yesterday
StarTrekRedneckStarTrekRedneck
1783
1783
add a comment |
add a comment |
My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.
The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?
The two exceptions to testing everything in my world are:
Performance after careful measurement of performance:
- never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.
When you have no clue what to do with an error
- there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.
Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:
the data type actully is a string
and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).
Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.
The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.
add a comment |
My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.
The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?
The two exceptions to testing everything in my world are:
Performance after careful measurement of performance:
- never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.
When you have no clue what to do with an error
- there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.
Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:
the data type actully is a string
and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).
Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.
The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.
add a comment |
My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.
The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?
The two exceptions to testing everything in my world are:
Performance after careful measurement of performance:
- never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.
When you have no clue what to do with an error
- there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.
Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:
the data type actully is a string
and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).
Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.
The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.
My take on this is to always, always check each and every input to my system. That means every parameter returned from an API should be checked, even if my program does not use it. I tend to as well check every parameter I send to an API for correctness. There are only two exceptions to this rule, see below.
The reason for testing is that if for some reason the API / input is incorrect my program cannot rely on anything. Maybe my program was linked to an old version of the API that does something different from what I believe? Maybe my program stumbled on a bug in the external program that has never before happened. Or even worse, happens all the time but no one cares! Maybe the external program is beeing fooled by a hacker to return stuff that can hurt my program or the system?
The two exceptions to testing everything in my world are:
Performance after careful measurement of performance:
- never optimize before you have measured. Testing all input / returned data most often takes a very small time compared to the actual call so removing it often saves little or nothing. I would still keep the error detection code, but comment it out, perhaps by a macro or simply commenting it away.
When you have no clue what to do with an error
- there are times, not often, when your design simply does not allow handling of the kind of error you would find. Maybe what you ought to do is log an error, but there is no error logging in the system. It is almost always possible to find some way to "remember" the error allowing at least you as a developer to later check for it. Error counters is one good thing to have in a system, even if you elect to not have logging.
Exactly how carefully to check inputs / return values is an important question. As example, if the API is said to return a string, I would check that:
the data type actully is a string
and that length is between min and max values. Always check strings for max size that my program can expect to handle (returning too large strings is a classical security problem in networked systems).
Some strings should be checked for "illegal" characters or content when that is relevant. If your program might send the string to say a database later, it is a good idea to be check for database attacks (search for SQL injection). These tests are best done at the borders of my system, where I can pinpoint where the attack came from and I can fail early. Doing a full SQL injection test might be difficult when strings are later combined, so that test should be done before calling the database, but if you can find some problems early it can be useful.
The reason for testing parameters I send to the API is to be sure that I get a correct result back. Again, doing these tests before calling an API might seem unnecessary but it takes very little performance and may catch errors in my program. Hence the tests are most valuable when developing a system (but nowadays every system seems to be in continous development). Depending on the parameters the tests can be more or less thorough but I tend to find that you can often set allowable min and max values on most parameters that my program could create. Perhaps a string should always have at least 2 characters and be a maximum of 2000 characters long? The min and maximum should be inside what the API allows as I know that my program will never use the full range of some parameters.
answered 10 hours ago
ghellquistghellquist
1214
1214
add a comment |
add a comment |
To give a slightly differing opinion:
I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.
There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".
"I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.
– Voo
9 hours ago
@Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)
– Stobor
2 hours ago
@ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.
– Stobor
2 hours ago
add a comment |
To give a slightly differing opinion:
I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.
There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".
"I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.
– Voo
9 hours ago
@Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)
– Stobor
2 hours ago
@ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.
– Stobor
2 hours ago
add a comment |
To give a slightly differing opinion:
I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.
There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".
To give a slightly differing opinion:
I think it can be acceptable to just work with the data you are given, even if it violates it's contract. This depends on the usage: It's something that MUST be a string for you, or is it something you are just displaying / does not use etc. In the latter case, simply accept it.
I have an API which just needs 1% of the data delivered by another api. I could not care less what kind of data are in the 99%, so I will never check it.
There has to be balance between "having errors because I do not check my inputs enough" and "I reject valid data because I am too strict".
answered 16 hours ago
Christian SauerChristian Sauer
749415
749415
"I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.
– Voo
9 hours ago
@Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)
– Stobor
2 hours ago
@ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.
– Stobor
2 hours ago
add a comment |
"I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.
– Voo
9 hours ago
@Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)
– Stobor
2 hours ago
@ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.
– Stobor
2 hours ago
"I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.
– Voo
9 hours ago
"I have an API which just needs 1% of the data delivered by another api." This then opens up the question why your API expects a 100 times more data than it actually needs. If you need to store opaque data to pass on, you don't really have to be specific as to what it is and don't have to declare it in any specific format, in which case the caller wouldn't be violating your contract.
– Voo
9 hours ago
@Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)
– Stobor
2 hours ago
@Voo - My suspicion is that they are calling some external API (like "get weather details for city X") and then cherry-picking the data they need ("current temperature") and ignoring the rest of the returned data ("rainfall", "wind", "forecast temperature", "wind chill", etc...)
– Stobor
2 hours ago
@ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.
– Stobor
2 hours ago
@ChristianSauer - I think you are not that far from what the wider consensus is - the 1% of the data that you use makes sense to check, but the 99% which you don't does not necessarily need to be checked. You only need to check the things which could trip your code up.
– Stobor
2 hours ago
add a comment |
protected by gnat 9 hours ago
Thank you for your interest in this question.
Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).
Would you like to answer one of these unanswered questions instead?
7
What are the impacts of not handling those unexpected values if they are returned? Can you live with these impacts? Is it worth the complexity to handle those unexpected values to prevent having to deal with the impacts?
– Vincent Savard
yesterday
34
If you're expecting them, then by definition they're not unexpected.
– Mason Wheeler
yesterday
13
Remember the API isn't obligated to only give you valid JSON back (I'm assuming this is JSON). You could also get a reply like
<!doctype html><html><head><title>504 Gateway Timeout</title></head><body>The server was unable to process your request. Make sure you have typed the address correctly. If the problem persists, please try again later.</body></html>
– immibis
yesterday
4
What does "external API" mean? Is it still under your Control?
– Deduplicator
yesterday
3
Absolutely. expect nothing, expect anything.
– S..
16 hours ago