• 0 Posts
  • 16 Comments
Joined 2 years ago
Cake day: June 30th, 2023

  • The things I am talking about are applied to the development process before you start writing code: rules from NASA’s Power of 10, MISRA, ISO 26262, DO-178C, and so on, as well as general experience and an understanding of data flow and memory management. You fundamentally can’t apply any of that to a system that takes random pieces of text from the Internet and strings them together until the result looks like something.

    There is an enormous gray zone between so-called good code (which might not actually exist) and bad code that doesn’t work and has obvious problems from the beginning. That gray zone is the most dangerous part: code that looks good enough to pass your “Turing test” is where the most insidious problems get introduced. Since you have completely removed the planning stage, along with all the written-in-blood rules that come with it, and eliminated the experience element, you basically have to treat all of the code as if it were its most malicious part. And since that is impossible, you have simply dropped your standards to the ground.

    It’s like pouring sugar into concrete. When there is a lot of it, it’s obvious, and the concrete will never set. When there is just a little, it will set, but it will be structurally weaker in a way you can’t detect, and you have no idea when it will crack.
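
    To make the “rules applied before you write code” point concrete, here is a minimal sketch in the spirit of the Power of 10 style guidelines: a fixed loop bound, no dynamic allocation, validated parameters, a checked return value, and an assertion on an invariant. The names `MAX_SAMPLES` and `sum_samples` are hypothetical and exist only for illustration; this is not taken from any of the standards listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed upper bound, decided at design time. */
#define MAX_SAMPLES 64u

/*
 * Sums up to MAX_SAMPLES 16-bit samples into *out_sum.
 * Returns 0 on success, -1 if any argument is invalid.
 * The caller is expected to check the return value.
 */
int sum_samples(const uint16_t *samples, size_t count, uint32_t *out_sum)
{
    /* Validate every parameter before touching any data. */
    if (samples == NULL || out_sum == NULL || count > MAX_SAMPLES) {
        return -1;
    }

    uint32_t sum = 0u;

    /* The loop is statically bounded: count <= MAX_SAMPLES. */
    for (size_t i = 0u; i < count; ++i) {
        sum += samples[i];
    }

    /* Invariant: 64 samples of at most 65535 each cannot overflow 32 bits. */
    assert(sum <= (uint32_t)MAX_SAMPLES * UINT16_MAX);

    *out_sum = sum;
    return 0;
}
```

    The point of the sketch is that the bound, the error contract, and the invariant are all decided before the loop body is written, so a reviewer can check the code against them.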




  • Any human written code can and will introduce UB.

    And there is an enormous number of safeguards, tricks, practices, and tools we have come up with to combat it. All of those are categorically unavailable to an autocomplete tool, or to a tool who codes exclusively with an autocomplete tool.

    Also, I don’t see how it would take you more than 5 seconds to verify that a given function does not exist. It has happened to me, an LLM suggesting a nonexistent function, and searching by function name in the docs is instantaneous.

    Which means you can work with documentation. Which means you really, really don’t need the middle layer, like, at all.

    I haven’t run into any of those catastrophic issues.

    Glad you didn’t, but I’ve reviewed enough generated code to know that a lot of the time people think they’re OK when in reality they have just introduced an esoteric memory leak in a critical section. They didn’t do it on their own; they did it because an LLM told them to.

    If you don’t want to use it, don’t.

    It’s not about me. It’s about other people introducing shit into our collective lives, making it worse.


  • That’s why you use unit tests and integration tests.

    Good start, but not even close to being enough. What if the code introduces UB? Unless you specifically look for that, and nobody does, neither unit tests nor on-target tests will find it. What if it’s drastically inefficient? What if there are weird and unusual corner cases?
    Now you spend more time looking for all of that and designing tests that you wouldn’t have needed if you had followed proper practices from the beginning.
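
    A minimal sketch of how UB can sit behind passing tests, assuming a typical unit test that only exercises small inputs. Both function names are hypothetical. `midpoint_naive` is fine for small values but has signed overflow, which is undefined behavior in C, once `lo + hi` exceeds `INT_MAX`; `midpoint_safe` avoids that under the stated precondition.

```c
#include <assert.h>
#include <limits.h>

/*
 * Looks correct and passes tests with small inputs.
 * Undefined behavior (signed overflow) when lo + hi > INT_MAX.
 */
int midpoint_naive(int lo, int hi)
{
    return (lo + hi) / 2;
}

/*
 * Same result for the tested inputs, but well defined for
 * every pair that satisfies the precondition 0 <= lo <= hi.
 */
int midpoint_safe(int lo, int hi)
{
    return lo + (hi - lo) / 2;
}
```

    A test such as `midpoint_naive(2, 10) == 6` passes for both versions, so the difference only shows up if someone deliberately tests near `INT_MAX` or builds with a tool such as `-fsanitize=undefined` in GCC or Clang.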

    It would probably be a nice idea to do some kind of Turing test: a blind test to distinguish the AI-written parts of some code, to see how accurately people can tell them apart.

    But that’s worse! You do realise how that’s worse, right? You lose all the external ways to validate the code, now you have to treat all the code as malicious.

    For instance, to look up specific functions in C#’s extensive libraries.

    And spend twice as much time trying to understand why you can’t find a function that your LLM just invented with the absolute certainty of a fancy autocomplete. And if that’s an easy task for you, well, then why do you need this middle layer of randomness? I can’t think of a reason not to search the documentation instead of introducing this weird game of “will it lie to me”.







  • I maintain a strong conviction that if a good programmer uses an LLM in their work, they just add more work for themselves, and if a less-than-good one does it, they add new, exciting, and difficult-to-find bugs while maintaining false confidence in their code and themselves.
    I have seen so much code that looks good at first, second, and third glance but is actually full of shit, and I was able to find that shit through external validation, like talking to the dev or brainstorming ways to test it. Those are things you categorically cannot do with an unreliable random word generator.