Circuit reliability issues


  1. #1
    Join Date
    Nov 2007
    Posts
    10


    Dear All,

    I have been reading the posts on this forum for the past 3 years, and today is the first time I am posting a message myself. Hopefully, the community of members here can help me solve my problem.

    I have designed a circuit to switch 16 outputs on and off. It is hooked up to an RS-485-style network (not exactly RS-485 but similar: two lines idling at 24 volts and a third one for ground; each 24V line may be switched off to represent digital "ones" and "zeros"). Up to 128 switchers may be connected in parallel at any time. Each of these "switchers" is addressable, so the controller can "talk" to each of them separately and switch the appropriate output on or off. So, in essence, the controller can switch a maximum of 2048 outputs (128 switchers x 16 outputs).

    This whole network is used in multimedia applications, and its configuration changes from one application to the next: one may require only 10 switchers whilst the next may require 100.

    The prototypes I built all worked perfectly on my workbench, and reliability was 100%. Communication was rock-solid and, after much development, I had 300 of these switchers manufactured.

    Here is where the problems started. Now that we are using them in the field, we are experiencing reliability issues. Out of 100 switchers, there are always 2 or 3 that refuse to start up, causing all kinds of problems and sometimes short-circuiting themselves.

    The problem is that the circuit behaves as if the PIC doesn't start at all, or hangs, leaving the outputs in an indeterminate state and disturbing communication on the network. The circuit talks back to the controller using a current-loop kind of communication system, essentially switching a 270-ohm resistor from 24 volts to ground to represent digital "ones" and "zeros". This is fine when communicating at 9600 baud, because the duration of the "ones" is very short. However, if the PIC hangs and the transistor controlling this resistor somehow remains switched on, the resistor will eventually burn up, causing a short circuit in the network.
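
    A rough worked estimate shows why a stuck transistor is destructive. It assumes the full 24 V appears across the 270-ohm resistor (the transistor drop is ignored); the resistor's power rating is not stated in this post, so no margin is computed.

    ```latex
    \begin{align*}
    I_{\text{on}} &= \frac{24\ \text{V}}{270\ \Omega} \approx 88.9\ \text{mA} \\
    P_{\text{on}} &= \frac{(24\ \text{V})^2}{270\ \Omega} \approx 2.13\ \text{W} \\
    t_{\text{bit}} &= \frac{1}{9600\ \text{baud}} \approx 104\ \mu\text{s}
    \end{align*}
    ```

    During normal traffic the resistor only sees about 2 W for roughly 104 microseconds per bit, so the average dissipation stays low. Held on continuously, it has to dissipate the full 2 W or so with no duty-cycle relief.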

    This behavior is unpredictable. I may have a network with 100 switchers working perfectly at 3 PM, turn it off, turn it back on at 5 PM, and have 1 or 2 switchers not working anymore. Or I may have a system installed for 3 days with everything working perfectly at all times.

    At first, I thought that the problem was related to manufacturing, and that somehow the soldering of the Xtal or trace contaminants on the boards were to blame. I had each board cleaned up and then applied 5 coats of conformal coating to each of them. Unfortunately, that did not solve the issue.

    I then suspected power supply problems, but that doesn't seem to be the case. The circuit is powered by a clean 24 V DC line, and the voltage is regulated to 5 volts by a low-dropout regulator (LM2931AZ-5.0) with a 100uF can-type cap upstream and a 47uF tantalum cap downstream, as per the manufacturer's recommendations. A check with my scope shows a very clean power rail.

    The microcontroller is a PIC 16F877A, with both supply pins decoupled with a 0.1uF cap and the MCLR pin tied to the power supply rail through a 4.7K resistor. No output pin is left unused, and they all have 10K pull-down resistors.

    I am using a 4 MHz Xtal with 20 pF ceramic-disc caps, as close to the chip as can be. The whole circuit has a ground plane.

    More and more, I am suspecting that the problem is software-related. Specifically, I am wondering if I did not make an error in setting my configuration fuses.

    The symptoms are as if the switcher starts to communicate with the controller and then just stops. If that happens, the 270-ohm resistor stays connected between 24 volts and ground and will short-circuit and/or disrupt the current-loop communication. The other failure mode is that the switcher just stays non-responsive, as if the Xtal did not start up. Since this is intermittent and happens in the field, it is almost impossible for me to take readings with a scope when it happens. Please note that the switcher only communicates when asked; however, this problem pops up at power-up, even before I ask the switchers anything. I am sending a dummy message at the start of my program, because I read on this forum that this is preferable when using the USART.

    I am setting the fuses in the Melabs programmer, not in the program itself. My fuses are as follows:


    Oscillator: HS
    Watchdog timer: Enabled
    Power-up Timer: Enabled
    Brown-out Reset: Enabled
    Low voltage programming: Disabled
    Flash Program memory write: 0x1000-0x1FFF
    code: Protected
    Data EEPROM: Protected
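
    For reference, the clock- and reset-related fuses listed above could also be carried in the source, so every programmed board gets the same settings regardless of the programmer dialog. The fragment below is only a sketch: it assumes PBP is configured to assemble with MPASM, and it deliberately leaves out the code-protection and flash write-protect settings.

    ```basic
    ' Sketch only: assumes MPASM is the assembler used by PBP.
    ' Code protection, EEPROM protection and flash write-protect are
    ' intentionally omitted here and would still need to be set.
    @ __CONFIG _HS_OSC & _WDT_ON & _PWRTE_ON & _BODEN_ON & _LVP_OFF

    DEFINE OSC 4        ' Tell PBP the crystal frequency is 4 MHz
    ```

    With MPASM, the default configuration line in PBP's own device include file usually has to be commented out first, otherwise the two configuration words conflict at assembly time.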


    I am wondering if a programming blunder somehow puts the PIC in reset, leaving the outputs in an indeterminate state. But then again, the problem seems to affect only the pins driving the two transistors that control the 270-ohm resistors; the 16 transistors doing the switching are never affected. Or maybe the crystal doesn't start up? But why? Or maybe the 74ALS08 controlling the two "communication" transistors fails? I do have one unused pin on there that I forgot to tie to anything...

    I have attached the listing of the program to this message, as well as a JPEG file of the circuit.

    Any help or suggestions, be it hardware or software related will be more than welcome as I have been trying to solve this for the past 2 months and am now running out of ideas.

    Thanks and best regards,

    Patrice

  2. #2
    Join Date
    Aug 2006
    Location
    Look, behind you.
    Posts
    2,818



    Hi hkpatrice,
    have you tried installing a capacitor from your MCLR pin to ground to hold the PIC in a reset condition until the chip settles down internally?
    I see no TRISE statement in the code . . . You probably need one so the chip knows which pins are inputs and which are outputs.
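
    A minimal sketch of what the missing PORTE setup could look like on the 16F877A. It assumes the three PORTE pins are meant to be digital outputs and that no analog inputs are used anywhere on the chip; neither is confirmed by the posted code. On this part the PORTE pins come out of reset as analog inputs, so setting the direction alone is not enough.

    ```basic
    ' Sketch only: assumes RE0..RE2 are digital outputs and no ADC channels are used.
    ADCON1 = 7            ' PCFG = 0111: all analog-capable pins become digital I/O
    PORTE  = 0            ' Preload the output latches low before enabling the drivers
    TRISE  = %00000000    ' RE0..RE2 as outputs; bit 4 (PSPMODE) stays cleared
    ```

    If any analog input is actually needed, ADCON1 would have to be set to a different PCFG pattern from the datasheet table instead of 7.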
    Last edited by Archangel; - 22nd November 2007 at 06:39.
    If you do not believe in MAGIC, Consider how currency has value simply by printing it, and is then traded for real assets.
    .
    Gold is the money of kings, silver is the money of gentlemen, barter is the money of peasants - but debt is the money of slaves
    .
    There simply is no "Happy Spam" If you do it you will disappear from this forum.

  3. #3
    Join Date
    Nov 2007
    Posts
    10



    Joe s. said:

    "Hi hkpatrice,
    have you tried installing a capacitor on your MCLR pin to ground to hold the PIC in a reset condition until the chip settles down internally? I see (at least I think I see) you are feeding the MCLR pin through a resistor from the 24 volt line and limiting the voltage with a zener. I am wondering if that zener is introducing noise from its constant switching, cycling the voltage up/down. . . How does it work on a 5V regulated supply?"

    Hello Joe and thanks for the quick answer.

    -I haven't tried putting a cap on the MCLR line, however, I have the power-up timer enabled, doesn't that play a similar role?

    -The MCLR pin is fed through a resistor from the 5V rail, not the 24V one... And no zener there... Sorry for the quality of the BMP, it's the best I could do while staying under the 200Kb limit.

    -You are right about the missing TRISE statement! How could I miss that? Could that be the cause of the problems I've been experiencing?

    Thanks for the input!

  4. #4
    Join Date
    Sep 2004
    Location
    montreal, canada
    Posts
    6,898



    Can you try using PrimoPDF to post your schematic?
    http://www.primopdf.com/

    PrimoPDF is free and really nice, not sure 'bout the final PDF size so far

    The problem also seems to have something to do with the ISR, specifically the DISABLE/ENABLE/RESUME statements around it.
    Last edited by mister_e; - 22nd November 2007 at 17:15.
    Steve

    It's not a bug, it's a random feature.
    There's no problem, only learning opportunities.

  5. #5
    Join Date
    Nov 2007
    Posts
    10



    mister_e,

    Thanks for the suggestion. I tried converting the file to PDF, but it still comes in at 370Kb...
    Still too much to post on here.

    Please tell me more about this interrupt handling issue; you seem to have spotted something that eludes me. Of course, if the problem is in the ISR, it could cause the erratic behavior I've been experiencing...

    Thanks for the help!

  6. #6
    Join Date
    Sep 2004
    Location
    montreal, canada
    Posts
    6,898



    Try to ZIP your PDF, or send it to my e-mail.
    Quote from the MCS help file (On Interrupt):
    The most notable place to use DISABLE is right before the actual interrupt handler. Or the interrupt handler may be placed before the ON INTERRUPT statement as the interrupt flag is not checked before the first ON INTERRUPT in a program.
    Code:
    ON INTERRUPT GOTO serialin		' Declare interrupt handler routine
            '
            '
            '
            '
    ' Subroutines
    
    DISABLE					' Don't check for interrupts in this section
    
    getbuf:					' move the next character in buffer to bufchar
    
    	index_out = (index_out + 1)			' Increment index_out pointer (0 to 63)
    	IF index_out > (buffer_size-1) THEN index_out = 0	' Reset pointer if outside of buffer
    	ADDRESS = addressbuffer[index_out]			' Read buffer location
    	COMMAND = commandbuffer[index_out]
    RETURN
    
    
    error:					' Display error message if buffer has overrun
    	errflag = 0			' Reset the error flag
    	CREN = 0			' Disable continuous receive to clear overrun flag
    	CREN = 1			' Enable continuous receive
    	GOTO main		' Carry on
    	
    	
    ' Interrupt handler
            '        Where's the Disable???
    serialin:				' Buffer the character received
    IF PIR1.5=1 THEN                        'IF THIS IS A USART INTERRUPT....   
    	IF OERR THEN usart_error			' Check for USART errors
    	index_in = (index_in + 1)			' Increment index_in pointer (0 to 63)
    	IF index_in > (buffer_size-1) THEN index_in = 0	'Reset pointer if outside of buffer
    	IF index_in = index_out THEN buffer_error	' Check for buffer overrun	
    		HSERIN badparity,10,badparity,[addressbuffer[index_in],commandbuffer[index_in]]          		' Read USART and store character to next empty location
    	IF RCIF THEN serialin				' Check for another character while we're here
    ENDIF
    	
    RESUME					' Return to program
    
    badparity:
    IF index_in=0 THEN
    index_in= (buffer_size-1)
    ELSE
    	index_in = (index_in - 1) 	 ' Move pointer back to avoid corrupting the buffer.
    ENDIF
    GOTO main
    
            '        You don't even need this check, as you used the HSER_CLROERR define.
            '        Even if CLROERR doesn't work, YOU DON'T WANT TO use a GOTO inside the ISR,
            '        or you'll experience a stack overflow/underflow one day or another. The same rule applies to the badparity sub, which is also called somewhere in the main loop.
    usart_error:
      errflag=1
      GOTO main
    
    
    buffer_error:
    	errflag.1 = 1		' Set the error flag for software
    IF index_in=0 THEN
    index_in= (buffer_size-1)
    ELSE
    	index_in = (index_in - 1) 	 ' Move pointer back to avoid corrupting the buffer.
    ENDIF	
    	
    RESUME					' Return to program
    MAYBE it's safe to place DISABLE before the ISR as long as there's an ENABLE somewhere after it, but I don't see one.
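
    As a rough illustration of the structure described above, here is a stripped-down handler with DISABLE before it, ENABLE after it, and RESUME as the only exit. It is a sketch, not a drop-in replacement: the names rx_isr, rx_byte, rx_flag and err_flag are hypothetical, the USART is assumed to be configured already, and the parity and two-byte buffering of the original listing are left out.

    ```basic
    ' Sketch only: hypothetical names, USART assumed already set up.
    rx_byte   VAR BYTE
    rx_flag   VAR BIT
    err_flag  VAR BIT

    ON INTERRUPT GOTO rx_isr     ' Declare the interrupt handler
    PIE1.5 = 1                   ' RCIE: enable the USART receive interrupt
    INTCON = %11000000           ' GIE and PEIE on

    main:
        IF rx_flag = 1 THEN
            rx_flag = 0
            ' ... act on rx_byte here ...
        ENDIF
        IF err_flag = 1 THEN
            err_flag = 0
            ' ... report or recover from the overrun here ...
        ENDIF
        GOTO main

    DISABLE                      ' No interrupt checks inside the handler
    rx_isr:
        IF RCSTA.1 = 1 THEN      ' OERR: receiver overrun
            RCSTA.4 = 0          ' Toggle CREN to clear the overrun
            RCSTA.4 = 1
            err_flag = 1         ' Tell the main loop; no GOTO out of the handler
        ENDIF
        IF PIR1.5 = 1 THEN       ' RCIF: a byte is waiting
            rx_byte = RCREG      ' Reading RCREG clears RCIF
            rx_flag = 1
        ENDIF
        RESUME                   ' Single exit from the handler
    ENABLE                       ' Interrupt checks resume for code placed after this
    ```

    Errors are passed to the main loop through flags, so the handler never leaves through GOTO main.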

    Maybe there's something else; those are the first things that jumped out at me.

    HTH
    Steve

